
In re Patent Application of:
Lisa A. Neuhold et al. 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



Docket No.: 00630/100D532-US1

(PATENT) 



Application No.: 09/717,450 



Art Unit: 1632 



Filed: November 20, 2000 



Examiner: M. C. Wilson 



For: TRANSGENIC ANIMAL MODEL FOR 

DEGENERATIVE DISEASES OF 
CARTILAGE 

DECLARATION OF DR. ROGER ASKEW UNDER 37 C.F.R. § 1.132

MS Amendment 
Commissioner for Patents 
P.O. Box 1450 
Alexandria, VA 22313-1450 

Dear Sir: 

I, ROGER ASKEW, do hereby declare and state the following: 

1. I, Roger Askew, am a citizen of the United States, and I am more than twenty-one 
years of age. 

2. I make this declaration in support of the above-identified application ("the '450 
application"). 

3. I presently hold the position of Director of Molecular Genetics at Wyeth 
Research, Andover, Massachusetts, and have held this position for three years. 



{W:\00630\100D532US1\00509102.DOC}




4. My qualifications as a scientist, and in particular in the field of transgenic animals
and gene-targeting, are set forth on the copy of my curriculum vitae, which is attached as Exhibit
A.

5. I have read and am familiar with the specification of the '450 application as filed, 
the Final Office Action mailed November 11, 2004 in connection with this application, 
Applicants' Response to Final Action mailed January 2, 2005, the Declaration of Lisa A. 
Neuhold, Ph.D. under 37 C.F.R. 1.132 mailed on April 6, 1999 in connection with the parent of 
the '450 application (U.S. Serial No. 08/994,689; "the '689 application"), and the Second
Declaration of Lisa A. Neuhold, Ph.D. under 37 C.F.R. 1.132 mailed on August 31, 2000 in 
connection with the '689 application. 

6. It is my understanding that the Examiner believes that the '450 application does not
teach chondrocyte-specific promoters and that a molecular biologist of ordinary skill would
not know how to use such promoters in the context of the '450 application's invention. The
following paragraphs (7-8) describe sections of the '450 application and references demonstrating
that chondrocyte-specific promoters are adequately described in the '450 application and were
known to molecular biologists of ordinary skill at the time the '689 application was filed (1997).

7. There is a description of promoters that direct transcription in joint tissues, such as 
chondrocyte-specific promoters, i.e., that provide spatial control of expression, in the '450 
application (see for example, page 15, line 19 to page 16, line 6). The application discloses that 
expression of a matrix degrading enzyme (MDE) in chondrocytes, which are the cells found in




articular cartilage of the joint, results in localized degradation of extracellular matrix proteins. 
Having established this principle with a working example (the Type II collagen promoter), one of 
ordinary skill in the art would recognize that joint (i.e., chondrocyte) tissue-specific expression of 
an MDE, particularly a collagen II-degrading MMP, would yield the desired joint degradation.

8. As evidence that chondrocyte-specific promoters were known by those of ordinary skill
in the art at the time of the present invention, I provide and discuss references dating from the
approximate time of the invention exemplifying the highly characterized nature of other
chondrocyte-specific promoters. The first reference, Pirok et al., Structural and Functional
Analysis of the Chick Chondroitin Sulfate Proteoglycan (Aggrecan) Promoter and Enhancer
Region, The Journal of Biological Chemistry, Vol. 272, No. 17, pp. 11566-11574, 1997 (attached
as Exhibit B), discloses mapping of 5' segments (enhancer and silencer, i.e. promoter, elements) of
a cartilage-specific gene: the chondroitin sulfate proteoglycan (CSPG; Aggrecan) gene. These
promoter elements are responsible for the chondrocyte-specific expression of the CSPG gene.
Another reference, which was received by the journal in which it is published in May 1998 and
thus clearly represents work done in 1997 (Lefebvre et al., A New Long Form of Sox5 (L-Sox5),
Sox6 and Sox9 are coexpressed in chondrogenesis and cooperatively activate the type II collagen
gene, The EMBO Journal, Vol. 17, No. 19, pp. 5718-5733, 1998; "Lefebvre"; attached as Exhibit
C), discloses cooperation of transcription factors Sox5 and Sox6 in providing chondrocyte-specific
expression of the Col2a1 gene. Specifically, this reference describes the elements of at least two
promoters, the Col2a1 and Col11a2 promoters, which lead to chondrocyte-specific gene expression
(see, e.g., p. 5719, col. 1, first and second full paragraphs of Lefebvre). Accordingly, not only do
these references describe chondrocyte-specific promoters known in the art at the time of the
invention, but they also describe the characterization of the molecular features of these promoters.




In conclusion, both of the references described chondrocyte-specific promoters known in the art at 
the time of the present invention. 

9. Additionally, it would be routine for one of ordinary skill in the art to identify the
promoter(s) responsible for chondrocyte-specific expression, e.g. the promoters of the
chondrocyte-specific genes in the references above. For example, simple β-galactosidase expression
experiments would demonstrate whether a promoter is a chondrocyte-specific promoter.

10. It is my understanding that the Examiner believes that the '450 application does not 
teach transgenic non-human mammals other than mice and that an ordinary skilled molecular 
biologist would not know how to use such transgenic non-human mammals, particularly rats, in 
the context of the '450 application's invention. In coming to this conclusion, the Examiner cites 
several references (e.g. Mullins et al., Nature, 1990, 344:541-544; Hammer et al., Cell, 1990,
63:1099-1112; Mullins et al., Hypertension, 1993, 22(4): 630-633), asserting that these references
demonstrate the unpredictability of developing transgenics. However, none of the references 
pointed to by the Examiner are directed to the same system targeted in the transgenic mammal 
(e.g., rat) of the present invention. The instant invention uses a constitutively enzymatically 
active human matrix metalloproteinase to cleave type II collagen. Degradation of type II collagen 
is not highly species dependent because Type II collagen is highly conserved between species. 
Degradation of type II collagen is not as complicated as the phenotypes in the references cited by
the Examiner (e.g. hypertension: see Mullins et al., Nature, 1990, 344:541-544 and Mullins et al.,
Hypertension, 1993, 22(4): 630-633). Thus, success of the transgenic mouse, as is demonstrated in
this application, is highly predictive of success in the transgenic rat. 




to a level sufficient to cause Type II collagen degradation in the joints. In my sixteen years of
[illegible] a transgenic mouse in [illegible]

several [illegible]. For example, we [illegible]

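12. I further declare that all statements made herein of my own knowledge are true,
and that all statements made on information and belief are believed to be true. I further declare
that these statements were made with the knowledge that willful false statements and the like
so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the
United States Code, and that such willful false statements may jeopardize the validity of the
instant application or of any patent issued thereon.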



Respectfully submitted,




Curriculum Vitae 



G. Roger Askew 

Business Address:
Wyeth Research
One Burtt Road
Andover, MA 01810
Phone 978-247-2690
FAX 978-247-2580

Home Address:
11 High Ridge Road
Boxford, MA 01921
978-887-1351

Education: 

1989 Ph.D., Department of Molecular Biology and Biochemistry, Wesleyan University, 

Middletown, CT. 

1980 B.S., Biology, University of Connecticut, Storrs, CT. 

Positions: 

2003 - current Associate Director, Molecular Genetics, Department of Applied Genomics, Wyeth 
Research, Andover, MA 

2001-2003 Principal Scientist I, Department of Molecular Genetics, Wyeth Research, Andover,
MA 

1999-2001 Senior Research Scientist II, Department of Molecular Genetics, Wyeth Research, 
Andover, MA 

1996-1999 Senior Research Scientist, Department of Molecular Genetics, Wyeth-Ayerst Research,
Princeton, NJ

1994-1996 Senior Scientist, Department of Molecular Genetics, Wyeth-Ayerst Research,
Princeton, NJ

1992-1994 Research Associate, Program of Excellence in Molecular Biology of Heart and Lung,
Department of Molecular Genetics, Biochemistry and Microbiology, 
University of Cincinnati College of Medicine, Cincinnati, OH. 

1989-92 Postdoctoral Assistant, Dr. Jerry B Lingrel, Program of Excellence in Molecular 
Biology of Heart and Lung, Department of Molecular Genetics, Biochemistry and 
Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH. 



1984-89 Graduate student, Department of Molecular Biology and Biochemistry,
Wesleyan University, Middletown, CT.



Invited Seminars: 

2002 "Evaluation of a conditional knock out gene trapping strategy " 

Second International Gene Workshop, 
University of Frankfurt Medical School Frankfurt, Germany 

2001 "Site directed transgenesis by recombinase driven insertion" 

First International Gene Trap Workshop, Mount Sinai Hospital, Toronto, Canada 

1999 "Phenotypic analysis of Estrogen Receptor-deficient mice" 

Breckenridge Workshop on Steroid Receptors in Brain Function, Breckenridge, Colorado

1994 "Interaction of the Cardiac Glycoside Receptor with its Ligand"

Program of Excellence Meeting (MIT, UCinci., UCSF), Santa Barbara, CA 

1993 "Site directed Point Mutations in Embryonic Stem Cells: a Gene Targeting Tag-and-Exchange Strategy"

Department of Molecular Genetics, University of Cincinnati, Cincinnati,OH 

1993 "Site directed Point Mutations in Embryonic Stem Cells: a Gene Targeting Tag-and-Exchange Strategy"

Division of Nephrology, Indiana University Medical School, Indianapolis, IN 

1993 "Targeting Point Mutations in Embryonic Stem Cells by Sequential replacement"

Institute for Developmental Biology, Children's Hospital, Cincinnati, OH 

Honors and Awards: 

1992 American Heart Association, Ohio Affiliate Grant-In-Aid.

1988 Sigma Xi grant in aid of research award.

1988 Peterson Prize for excellence in biochemistry.

1987 Elected to membership in Sigma Xi.

1986 Sigma Xi grant in aid of research award.

1984-89 Full Graduate Tuition Fellowship, Wesleyan University. 

Publications: 

Glasson SS. Askew R. Sheppard B. Carito BA. Blanchet T. Ma HL. Flannery CR. Peluso D. Kanki K.
Yang Z. Majumdar MK. Morris EA. Deletion of active ADAMTS5 prevents cartilage degradation in a
murine model of osteoarthritis. Nature 434(7033):644-8, 2005 Mar.

Glasson SS. Askew R. Sheppard B. Carito BA. Blanchet T. Ma HL. Flannery CR. Kanki K. Wang E. 
Peluso D. Yang Z. Majumdar MK. Morris EA. Characterization of and osteoarthritis susceptibility in 
ADAMTS-4-knockout mice. Arthritis & Rheumatism. 50(8):2547-58, 2004 Aug 

Shughrue PJ, Askew GR, Dellovade TL, and Merchenthaler I, Estrogen-binding sites and their 
functional capacity in estrogen receptor double knockout mouse brain. Endocrinology 143:1643- 
1650, 2002. 

James PF, Grupp IL, Grupp G, Woo A, Askew GR, Croyle ML, Walsh RA and Lingrel JB.
Analysis of mice heterozygous for null mutations of the Na,K-ATPase α1 and α2 isoform genes
identifies a specific role for the α2 isoform as regulator of calcium in the heart. Molecular Cell.
3(5):555-63, 1999 May

Babij P, Askew GR, Niewenhuisjsen B, Su CM, Bridal TR, Jow B, Argentieri TM, Kulik J, Degennaro
LJ, Spinelli W and Colatsky TJ. Inhibition of cardiac delayed rectifier K+ current by over expression of
the Long Q-T syndrome HERG G628S mutation in transgenic mice. Circulation Research 83:668-678,
1998.

Shughrue P, Scrimo P, Malcolm L, Askew GR, and Merchenthaler I. The distribution of estrogen
receptor-β mRNA in forebrain regions of the estrogen receptor-α knock out mouse. Endocrinology
138(12):5649-5652, 1997.

Sato A, Askew GR, Hein, J., Masaki H., and Yatani A: Modulation of Na+,K+-pump function by
mutations in the first transmembrane region of the Na+,K+-ATPase α1 subunit. Am J Physiol 270 (Cell
Physiol. 39): C457-C464, 1996.

Crump RG, Askew GR, Wert S, Lingrel, JB, and Joiner CH: In situ localization of sodium-potassium 
ATPase mRNA in developing mouse lung epithelium. Am J Physiol 269 (Lung Cell. Mol. Physiol. 13): 
L299-L308, 1995. 

Linn SC, Askew GR, Menon AG, and Shull GE: Conservation of an AE3 Cl-/HCO3- exchanger
cardiac-specific exon and promoter region and AE3 mRNA expression patterns in murine and human hearts.
Circulation Research 76:584-591, 1995.

Askew GR, and Lingrel JB: The amino acid substitution C111Y in human α1 Na,K-ATPase confers
differential resistance to structurally related cardiac glycosides. J Biol Chem 269:24120-24126, 1994.

Askew GR, Lingrel JB, Grupp I, and Grupp G: Direct correlation of NKA-Isoform abundance and
myocardial contractility in mouse heart. In: Bamberg E, Schoner W, editors. The Sodium Pump. Springer,
New York: Darmstadt and Steinkopff: 718-721, 1994.

Lingrel JB, Van Huysse J, Jewell-Motz B, Schultheis P, Wallick ET, O'Brien W, and Askew GR:
Na,K-ATPase: Cardiac Glycoside Binding and Functional Importance of Negatively Charged Amino Acids of
Transmembrane Regions. Kidney International, 1994

Lingrel JB, Van Huysse J, O'Brien W, Jewell-Motz E, Askew GR, and Schultheis P: Structure-function
studies of the Na,K-ATPase. In: Bamberg E, Schoner W, editors. The Sodium Pump. Springer, New
York: Darmstadt and Steinkopff: 276-286, 1994.

Askew GR, Doetschman T and Lingrel JB: Site-directed point mutations in embryonic stem cells: a gene
targeting tag-and-exchange strategy. Mol Cell Biol 13:4115-4124, 1993.

Beck K, Seekamp A, Askew GR, Zhu M, Farrell C, Wang S, and Lukens L: Association of a change in
chromatin structure with a tissue-specific switch in transcription start sites in the α2(I) collagen gene.
Nuc Acid Res 19:4975-4982, 1991.



Askew GR, Wang S, and Lukens L: Different levels of regulation accomplish the switch from type II to
type I collagen gene expression in 5-bromo-2'-deoxyuridine-treated chondrocytes. J Biol Chem
266:16834-16841, 1991.

Abstracts: 

B.J. Sheppard, S.S. Glasson, L. Block, R. Askew, T. Blanchet, M. Leach, and E.A.
Morris. Characterization of ADAMTS4 knock out mice. American College of Veterinary
Pathologists, 2004

Glasson, SS; Blanchet, TJ; Carito, BA; Tavares, JL; Peluso, D; Askew, R; Kanki, K; Morris, EA. 
Osteoarthritis in Aggrecanase-I knockout and wild-type mice is comparable following surgical instability. 
Orthopedic Research Society, 2003 

Glasson, SS; Blanchet, TJ; Carito, BA; Tavares, JL; Peluso, D; Askew, R; Kanki, K; Morris, EA. 
Aggrecanase-2 is critical for osteoarthritis progression in a surgical model. Orthopedic Research Society,
2003 

Yogendra Kharode, Paula Green, James Marzolf, Weiguang Zhao, Roger Askew, Paul Yavorsky and 
Frederick Bex. Alteration in bone density of mice due to heterozygous inactivation of LRP6. American 
Society for Bone and Mineral Research, 25th annual, 2003

Kwak SP, Barton M, Monaghan MM, Askew GR, Doliveira LC, Comery TA, Kulik J, Degennaro
L, Marquis KL, and Rhodes KJ. Creation and characterization of Kvβ1.1 knock-in mice lacking
the N-terminus necessary for rapid inactivation. Neuroscience, 2001

Askew GR, Grupp I, Grupp G, Lingrel JB, Slack J, and Tosun, M: Direct correlation of NKA-Isoform
abundance and myocardial contractility in mouse heart. Biol Chem. Hoppe-Seyler 374:563A, 1993.

Askew GR, and Lingrel JB: Identification of a residue at the cardiac glycoside binding site of
Na,K-ATPase. Biol Chem. Hoppe-Seyler 374:605A, 1993.

Lingrel JB, Schultheis P, Jewell-Motz EA, Van Huysse J, Askew GR, O'Brien W, Kuntzweiler TA, and
Wallick ET: Studies of the cardiac glycoside and cation binding sites of Na,K-ATPase using site-directed
mutagenesis. Biol Chem. Hoppe-Seyler 374:5545 A, 1993.

Crump RG, Askew GR, Wert S, and Joiner CH: Perinatal development of Na,K-ATPase α and β subunit
mRNAs in mouse lung. Ped Res 33:45A, 1993.

Crump RG, Joiner CH, Askew GR, and Wert S: Ontogeny of Na,K-ATPase α1 isoform mRNA in mouse
lung epithelium during the perinatal period. The Physiologist 35: 20A, 1992.

Askew GR and Lingrel JB: Gene modification of the α2 Na,K-ATPase gene in murine embryonic stem
cells. Presented at the Gordon Conference on Molecular Genetics, 1991.

Patents: 

Conditional knockout method for gene trapping and gene targeting using an inducible gene 
silencer 

R Askew, M Barton and K Kanki. US Patent application (filed May 2003). 



Case number AMI 00651 

Knock in transgenic mammal containing a non-functional N-terminus of Kv beta 1.1 subunit

This application claims priority from a copending provisional application serial number 60/308,485, filed 

on July 27, 2001, the entire disclosure of which is hereby incorporated by reference. 



The Journal of Biological Chemistry 

© 1997 by The American Society for Biochemistry and Molecular Biology, Inc. 



Vol. 272, No. 17, Issue of April 25, pp. 11566-11574, 1997 

Printed in U.S.A. 



Structural and Functional Analysis of the Chick Chondroitin 
Sulfate Proteoglycan (Aggrecan) Promoter and Enhancer Region* 

(Received for publication, September 24, 1996, and in revised form, February 5, 1997) 

Edward W. Pirok III‡§, Hao Li§¶‖, James R. Mensch, Jr.§, Judith Henry§,
and Nancy B. Schwartz§¶**‡‡

From the Departments of ‡Pathology, §Pediatrics, and ¶Biochemistry and Molecular Biology,
and **Committee on Developmental Biology, University of Chicago, Chicago, Illinois 60637



Aggrecan is a large chondroitin sulfate proteoglycan, 
the expression of which is both tissue-specific and de- 
velopmentally regulated. Here we report the cloning 
and sequencing of the 1.8-kilobase genomic 5' flanking 
sequence of the chick aggrecan gene and provide a func- 
tional and structural characterization of its promoter 
and enhancer region. Sequence analysis reveals potential
Sp1, AP2, and NF-I related sites, as well as several
putative transcription factor binding sites, including 
the cartilage-associated silencers CIIS1 and CIIS2. A 
number of these transcription factor binding motifs are 
embedded in a sequence flanked by prominent inverted 
repeats. Although lacking a classic TATA box, there are 
two instances in the 1.8-kb genomic fragment of TATA- 
like TCTAA sequences, as have been defined previously 
in other promoter regions. Primer extension and S1 protection
analyses reveal three major transcription start
sites, also located between the inverted repeats. Tran- 
sient transfections of chick sternal chondrocytes and 
fibroblasts with reporter plasmids bearing progres- 
sively reduced portions of the aggrecan promoter region 
allowed mapping of chondrocyte-specific transcription 
enhancer and silencer elements that are consistent with 
the sequence analysis. These findings suggest the impor- 
tance of this regulatory region in the tissue-specific ex- 
pression of the chick aggrecan gene. 



During development, the extracellular matrix is a complex 
dynamic structure, the components and organization of which 
help to establish the requisite position and state of differenti- 
ation. The large chondroitin sulfate proteoglycan (CSPG), 1 ag- 
grecan, has been localized predominantly to skeletal tissue and 
is considered to be a hallmark of cartilage differentiation. In 
chick cartilage, aggrecan expression begins at embryonic day 5 
in limb rudiments, continues through the entire period of chon- 
drocyte development, and remains a biochemical marker of the 
cartilage phenotype thereafter. In very early embryos, aggre- 



* This work was supported by U. S. Public Health Service Grants 
AR-19622 and HD-09402, a grant from the Brain Research Foundation 
(to N. B. S.), Training Grant HL-07237 (to H. L.), and a Fellowship from 
the Markey Program in Molecular Medicine (to E. W. P.). The costs of 
publication of this article were defrayed in part by the payment of page 
charges. This article must therefore be hereby marked "advertisement" 
in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. 

The nucleotide sequence(s) reported in this paper has been submitted 
to the GenBank™/EBI Data Bank with accession number(s) U83593. 

‖ Deceased.

‡‡ To whom correspondence should be addressed: The University of
Chicago, 5841 S. Maryland Ave., MC 5058, Chicago, IL 60637. Phone: 
312-702-6426; Fax: 312-702-9234. 

1 The abbreviations used are: CSPG, chondroitin sulfate proteogly- 
can; nt, nucleotide(s); kb, kilobase(s); bp, base pair(s); SP, signal pep- 
tide; PCR, polymerase chain reaction. 



can is expressed in the notochord as early as stage 16, long 
before chondrogenesis occurs (1). 

We have extensively studied the properties and expression of 
aggrecan from embryonic chick cartilage. These studies include 
synthesis and processing (2-5), structural analysis via peptide 
sequencing to elucidate glycosylation motifs, and a consensus 
sequence for O-xylosylation and mapping of the S103L mono- 
clonal antibody epitope (6-10). Moreover, we have conducted 
molecular analysis to construct the composite sequence of chick 
cartilage CSPG from overlapping cDNAs and to identify a 
defect in the aggrecan gene associated with the chondrodystro- 
phy, nanomelia (9, 11). 

This sequence, obtained from 10-day-old chick embryos, has 
6464 nucleotides that include an open reading frame encoding 
2109 amino acids and 16 nucleotides of the first untranslated 
exon (11). Another chick aggrecan cDNA sequence, obtained 
from embryonic chick brain, was 6597 nt in length, including 
265 nt of 5 '-untranslated exon sequence (12). Using chick 
CSPG cDNA probes, we subsequently isolated genomic clones 
containing exons encoding the chick CSPG core protein. The 
two 5' globular domains, Gl and G2, are encoded by four and 
three exons, respectively, and the interglobular domain is en- 
coded by a single exon. The chondroitin sulfate attachment 
domain is encoded by the largest exon, 3216 bp, which is 
approximately 50% of the total coding sequence. These data 
reveal that the chick CSPG gene contains at least 18 exons 
spanning more than 30 kb. No evidence was obtained for mul- 
tiple genes for aggrecan in the chick genome. Elucidation of the 
genomic organization of chick aggrecan has allowed for a more 
thorough comparison with the mammalian aggrecans, as well 
as the avian and mammalian link proteins, with respect to 
origin and mechanisms of divergence. A summary of this work 
was published recently (13). 
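The statement that the 3216-bp chondroitin sulfate attachment exon is roughly half of the coding sequence can be checked with simple arithmetic. The sketch below uses only the figures quoted above (2109 amino acids, 3216 bp) and assumes, for illustration, that the coding sequence is three nucleotides per amino acid with the stop codon excluded; the variable names are ours.

```python
# Figures quoted in the text above.
ORF_AMINO_ACIDS = 2109   # length of the open reading frame, in amino acids
CS_EXON_BP = 3216        # size of the chondroitin sulfate attachment exon

# Assumption: coding sequence = 3 nt per amino acid, stop codon not counted.
coding_nt = ORF_AMINO_ACIDS * 3
cs_fraction = CS_EXON_BP / coding_nt

print(f"coding sequence: {coding_nt} nt")
print(f"CS attachment exon share: {cs_fraction:.1%}")
```

Under that assumption the exon accounts for just over half of the coding sequence, in line with the "approximately 50%" figure in the text.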

We have also found that aggrecan is developmentally ex- 
pressed, in ovo and in limb bud cultures, on both protein and 
mRNA levels in a pattern commensurate with the onset of 
chondrogenesis. The modulation of expression of this cartilage- 
specific CSPG and type II collagen mRNA in stage 24 limb bud 
mesenchyme cells cultured in high density was examined un- 
der conditions that promote chondrogenesis in vitro (14) and 
mimics the same process in limb development in ovo. Morpho- 
logically, mesenchymal proliferation ceases by day 2, conden- 
sation occurs first in the formation of aggregates by days 4-5, 
and then of overt nodules by days 6-8, concomitant with cel- 
lular differentiation and production of matrix. Quantitatively, 
a 50-fold increase in aggrecan mRNA occurs from day 2 (when 
first detected) to day 6, followed by a slight decline (about 
2-fold) by day 8 when the message reaches a plateau thereafter 
(15). This same pattern is observed immunologically, using the 
monoclonal antibody S103L, which is specific to the aggrecan 
protein. These studies indicate that during limb development 

This paper is available on line at http://www-jbc.stanford.edu/jbc/ 






Analysis of the Chick Aggrecan Promoter Region 



the expression of these two differentiation-specific proteins is
stringently controlled until the establishment of the cartilage 
phenotype. Thereafter, aggrecan continues to be synthesized 
and deposited in the extracellular matrix, perhaps to effect a 
decrease in cell adhesion necessary for maintenance of the 
chondrogenic state. 

Concurrent with studies of mechanisms that control the tem- 
poral-spatial aspects of cartilage differentiation are structural 
and functional analyses of expression of the differentiation- 
specific products of the extracellular matrix. For instance, sig- 
nificant work has been done to understand the tissue-specific 
expression of collagen genes and the mechanisms that regulate 
their distinct transcriptional programs (16-18). In contrast, 
there have been no studies of the transcriptional regulation of 
the aggrecan gene that examine its tissue-specific expression 
during development. Mouse aggrecan has been cloned; how- 
ever, no functional analysis has been performed to examine its 
tissue specificity (19). A preliminary characterization of the rat 
aggrecan promoter has also appeared, describing a 120-bp se- 
quence containing transcription start sites (20). It is not clear 
whether this 120-bp genomic fragment contains tissue-specific 
control elements, because the 5' promoter/enhancer region is 
probably larger or may contain additional regulatory elements. 
The same report described promoter assays on a larger isolate 
containing an additional 520 bp of 5' flanking sequence, but the 
sequence data were not presented. 

Therefore, to begin to elucidate the mechanisms that govern 
aggrecan expression in chondrocytes, we have cloned the pro- 
moter region of the embryonic chick S103L-reactive CSPG (ag- 
grecan). The aim of the present study was to identify and 
characterize the cell- and stage-specific elements in the 5' 
genomic flanking region of the aggrecan gene, which could 
regulate the expression of this extracellular macromolecule 
during embryonic development. 

EXPERIMENTAL PROCEDURES 

Materials — Oligonucleotides were made with an Applied Biosystems 
380B DNA synthesizer. Reagents for biochemical and molecular cloning
experiments were of the highest quality available from commercial
vendors. Restriction endonucleases were from New England Biolabs
unless otherwise stated. T4 DNA ligase, T4 kinase, S1 nuclease, avian
myeloblastosis virus reverse transcriptase, and Klenow polymerase 
were from Promega. Taq polymerase was from Perkin-Elmer. A chick 
genomic library was purchased from CLONTECH Laboratories. 

Preparation of Probe and Screening of Chick Genomic Library — A 
chick aggrecan cDNA fragment comprising 260 bp of the 5'-untranslated
exon plus 56 bp of the signal peptide (SP) exon was obtained via
PCR from the previously reported cDNA, clone 1 (11). Because the
template clone was inserted in pGEM-4Z, the upstream primer was the
SP6 promoter primer (Promega); the downstream primer was a 17-mer,
5'-CTGTGGTGATGGCTTGC-3', from the antisense strand of the SP
exon. The probe was then purified by low-melting-point agarose gel
electrophoresis and labeled with 32P using a Multiprime DNA labeling
system and [α-32P]dCTP purchased from Amersham Corp. Approximately
50,000 independent members of the chick genomic library were
screened. The chick genomic library was plated, and nitrocellulose 
plaque-lifts were prepared and probed by hybridization according to 
standard methods (21). Positive plaques were picked, then re-plated, and 
screened as above for two or three rounds until the plaques were purified. 
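Because the downstream primer is taken from the antisense strand, the sense-strand sequence it anneals to is its reverse complement. The following minimal sketch derives that sequence and the primer's length and GC content from the 17-mer quoted above; the function and variable names are illustrative and not from the paper.

```python
# Downstream primer as quoted in the text (antisense strand of the SP exon).
SP_EXON_ANTISENSE_PRIMER = "CTGTGGTGATGGCTTGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


sense_target = reverse_complement(SP_EXON_ANTISENSE_PRIMER)
print(len(SP_EXON_ANTISENSE_PRIMER))                  # primer length in nt
print(sense_target)                                   # sense-strand site
print(f"{gc_fraction(SP_EXON_ANTISENSE_PRIMER):.1%}")  # GC content
```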

Isolation of Chick Aggrecan Genomic Clones — The screening yielded 
a 14-kb genomic fragment (Fig. 1B). Phage DNA was purified from plate
lysates (21). Isolates from the library screening were subcloned into the
vector pGEM-4Z by standard methods (21). Southern blot analysis
using the same aggrecan untranslated exon probe identified an approximately
1.8-kb BglII-BbsI genomic fragment that was subcloned into
pGEM-4Z. Initial sequencing with the T7 promoter primer (Promega)
revealed that one end of the subclone had a sequence identical to the 5' 
145 bp of a previously published S103L-CSPG cDNA sequence (12), 
with the exception of three dA residues that were not present in the 
cDNA sequence. The genomic clone has a tract of 21 dAs where the 
cDNA has a stretch of 18 dAs. This likely reflects an error arising 
during library generation because the flanking sequences are identical. 




The 1.8-kb insert was excised from pGEM-4Z by EcoRI-KpnI digestion,
treated with Klenow polymerase, and blunt-end ligated into the reporter
vector pGL2-Basic (Promega), which had been linearized with
the restriction enzyme NheI and treated with Klenow. The reporter
vector pGL2-Basic does not contain any eukaryotic promoter or enhancer
elements. Sequences to be assayed for promoter activity are
inserted upstream (5') of a luciferase gene. Plasmids were sequenced to
find clones that had the insert positioned in the forward (+) and reverse
(-) orientations (Fig. 2C). The forward orientation was defined as
having the 1.8-kb insert ligated into the reporter vector pGL2-Basic
with the same 5'-3' orientation relative to the reporter gene as the
native sequence in the genomic clone relative to the aggrecan gene.
Constructs that contained the 1.8-kb genomic insert of the chick aggrecan
gene were named Ag-1(+) and Ag-1(-).

Sequence Determination and Analysis — Dideoxynucleotide chain termination
sequencing (22) of the BglII/BbsI DNA fragments subcloned
into pGEM-4Z plasmids was performed using the U. S. Biochemical 
Sequenase (version 2.0) system. Primers were T7 or SP6 promoter 
primers (Promega) or 18-20-mer oligonucleotides synthesized accord- 
ing to the obtained sequence. Multiple sequence determinations were 
made for each primer used. Ambiguities in sequencing were resolved by 
using a different polymerase (e.g. avian myeloblastosis virus reverse 
transcriptase), sequencing the complementary strand, or both. All res- 
idues were confirmed by at least two separate sequence determinations. 
DNA sequence analysis was performed using the Wisconsin Package 
(23). Searching for palindromic sequences was done using the program 
COMPARE to find inverted repeats by comparing the sequence to its 
own complement (24), and the results were displayed via the program 
DOTPLOT. Putative transcription factor binding sites were located 
with the program FINDPATTERNS using the pattern file tfsite.dat, 
which comprises the Transcription Factor Database (25). 
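The two searches described above, locating a short motif and finding inverted repeats by comparing a sequence with its own reverse complement, can be illustrated with a small sketch. This is not the Wisconsin Package's COMPARE, DOTPLOT, or FINDPATTERNS; it is a naive exact-match stand-in, and the 24-nt sequence is invented for the example rather than taken from the aggrecan 5' flanking region.

```python
# Toy sequence invented for illustration; NOT the aggrecan 5' flanking sequence.
TOY = "AAGCTGACTCTAATTTGTCAGCAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) exact match."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def find_inverted_repeats(seq: str, arm: int) -> list[tuple[int, int]]:
    """Return (left, right) start positions where the arm-length word at
    `left` reappears downstream, without overlap, as its reverse complement."""
    pairs = []
    for left in range(len(seq) - arm + 1):
        partner = reverse_complement(seq[left:left + arm])
        right = seq.find(partner, left + arm)
        while right != -1:
            pairs.append((left, right))
            right = seq.find(partner, right + 1)
    return pairs


print(find_motif(TOY, "TCTAA"))        # TATA-like motif positions
print(find_inverted_repeats(TOY, 6))   # 6-nt inverted repeat arms
```

In the toy sequence the TATA-like TCTAA motif lies between the two arms of the inverted repeat, which mirrors the arrangement the paper reports for binding motifs flanked by inverted repeats. A real analysis would also need to tolerate mismatches, which this exact-match sketch does not.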

Purification of DNA — Plated colonies were used to inoculate 5 ml of 
LB medium (21). The cells were grown overnight at 37 °C with vigorous 
shaking. The 5-ml culture was added to 400 ml of LB. The culture was 
shaken at 37 °C for at least 12 h, cells were harvested, and plasmid 
DNA was recovered using the QIAGEN Plasmid Maxiprep kit. 

Synthesis of Deletion Constructs — The inserts for plasmid constructs 
1300(+), 900(+), 500(+), and 500(-) were made by PCR using the 
Ag-1(+) construct as a template (Fig. 2, A and B, and Fig. 6B). XhoI
sites were introduced at the end of the amplified fragments via the
primers used. PCR fragments were purified using Qiaquick PCR Preps
(QIAGEN) and digested with XhoI for 2 h. The fragments were gel
purified and ligated into the XhoI site of the pGL2-Basic vector. Inserts
A(+) to F(+) were made via PCR with Ag-1(+) as a template, and the
primer oligonucleotides contained downstream BglII/SmaI and upstream
KpnI restriction enzyme cutting sites. The PCR fragments were
gel purified, digested with BglII and KpnI, and ligated directly into
pGL2-Basic, producing the constructs A(+) to F(+) (Fig. 2, A and B, and
Fig. 6B). The constructs A(-) to F(-) were made in the same fashion as
above, except that each insert was digested with SmaI and KpnI at the
insert ends to ensure their opposite orientation in the pGL2-Basic
vector relative to the A(+) to F(+) inserts (Fig. 2, A and B, and Fig. 6B).
Sequencing of the various constructs was done to confirm the appropri- 
ate orientation of the inserts and exclude PCR artifacts. 
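
The insert sizes charted in Fig. 2 follow from the primer coordinates once the numbering convention is taken into account: position 1 is the most upstream transcription start site and the base immediately before it is -1, so there is no position 0. A minimal Python sketch of that arithmetic, using the coordinates and sizes given in Fig. 2 (the function name is illustrative):

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of the inclusive interval start..end when the
    coordinate system has no position 0 (-1 is followed by +1)."""
    if start == 0 or end == 0 or start > end:
        raise ValueError("coordinates must be nonzero and ordered")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the interval crosses -1/+1, and 0 does not exist
    return length


# (upstream coordinate, downstream coordinate, size stated in Fig. 2)
FIG2_CONSTRUCTS = {
    "1300(+)": (-1038, 238, 1276),
    "900(+)": (-638, 238, 876),
    "500(+/-)": (-247, 238, 485),
    "A(+/-)": (-283, 307, 590),
    "B(+/-)": (-240, 307, 547),
    "D(+/-)": (-69, 307, 376),
    "E(+/-)": (-1, 307, 308),
    "F(+/-)": (168, 307, 140),
}

if __name__ == "__main__":
    for name, (start, end, stated) in FIG2_CONSTRUCTS.items():
        print(name, span_length(start, end), "bp; stated:", stated, "bp")
```

These lengths cover only the Ag-1-derived span; the engineered restriction-site tails on the primers are not counted.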

Cell Cultures — Cultures of day-14 chick sternal chondrocytes were 
established according to the procedures described by Cahn et al. (26) 
and as modified by Campbell and Schwartz (3). Cultures of fibroblasts 
were established from skin of day-10 chick embryos following
trypsinization (3). Cells were plated at an initial density of
1.5 × 10^6/100-mm tissue culture dishes (Falcon) in either F-12 medium
(chondrocytes) or Dulbecco's modified Eagle's medium (fibroblasts) and
supplemented with 10% fetal calf serum. The cells were permitted to attach
to the dishes, and subsequent growth (2-3 days) was maintained by a 
complete change of the medium every 2 days (2). On the day of trans- 
fection, chondrocyte cultures were trypsinized, and single cells were 
suspended in F-12 medium, replated, and allowed to attach to the 
dishes for 3-4 h before treatment as described below. 

Transfection — Standard methods were followed for transient calcium 
phosphate transfections (21). Duplicate plates containing approximately
5 × 10^6 cells (either chondrocytes or fibroblasts) received 20
pmol of a given plasmid construct to be assayed. Five μg of a
β-galactosidase reporter plasmid were cotransfected with each experimental
construct to correct for cell loss. Duplicate transfection sets were repeated
three times, each time yielding similar results. The transfections were 
allowed to proceed for 36 h. The relative efficiency of transfecting the 
chondrocytes was approximately 13% that of transfecting the fibroblasts. 

Cell Recovery and Assays — Reagents for the luciferase and β-galactosidase
assays were purchased from Promega. Because both luciferase






Analysis of the Chick Aggrecan Promoter Region 



[Fig. 1 graphic not reproduced. Panel A: gene structure, labeled with exon number, exon size, intron number, and intron size. Panel B: cloning strategy, marking the 145-bp identity with exon 1 and the constructs pGEM-4Z, pGL2-Basic Ag-1(+), and pGL2-Basic Ag-1(-).]



Fig. 1. Chick aggrecan genomic structure and cloning strategy. A, a schematic diagram of the genomic structure of the chick aggrecan 
gene. The names and sizes of the introns or exons are printed above the diagram. Exons are represented as open boxes, except the first untranslated 
exon, which is represented as a checkered box. Introns are represented as the lines connecting the boxes. The diagram is not to scale. B, the cloning 
strategy for the chick aggrecan promoter region. On the right side of the diagram is a chart that indicates the size, vector, and name used for each 
construct. As described under "Experimental Procedures," cDNA from the untranslated and signal peptide exons was used to screen a chick 
genomic library. The 14-kb genomic fragment obtained was subcloned into the vector pGEM-4Z and is represented as a black rectangle with the 
checkered pattern indicating the region of overlap with the first untranslated exon. The 14-kb fragment was digested with BglII and BbsI, and the
resultant 1.8-kb fragment was subcloned into the sequencing vector pGEM-4Z and the luciferase reporter vector pGL2-Basic, which does not 
contain a eukaryotic promoter region. Each orientation of the genomic inserts was confirmed by sequencing. 



assays and β-galactosidase assays were performed, Promega's Reporter
Lysis Buffer (RBL, E3971) was used to prevent the inhibition of
β-galactosidase activity that occurs in buffers containing detergents such as
Triton X-100. No deviations were made from the manufacturer's protocol
for preparation of extracts from tissue culture cells. The enzymatic
activity of luciferase was measured with a luminometer (Analytical
Luminescence Laboratory, Monolight 1500). The enzymatic activity for
β-galactosidase was measured with a microplate reader (Dynatech) at
409 nm. Standard deviations were determined for the six assays performed
on duplicate plates within one experiment.
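
The normalization used for the reporter data (luciferase activity corrected by the cotransfected β-galactosidase signal, then expressed relative to the no-insert vector, which the legends to Figs. 5 and 6 define as one) can be sketched in Python as follows. The readings are invented placeholders chosen only to make the arithmetic easy to follow; they are not data from this study, and the function names are illustrative.

```python
from statistics import mean, stdev


def normalized(luc, bgal):
    """Luciferase reading divided by the matching beta-galactosidase reading."""
    return [l / b for l, b in zip(luc, bgal)]


def fold_increase(sample_luc, sample_bgal, control_luc, control_bgal):
    """Mean and sample S.D. of normalized activity relative to the
    no-insert control, whose mean normalized activity is defined as one."""
    control = mean(normalized(control_luc, control_bgal))
    ratios = [r / control for r in normalized(sample_luc, sample_bgal)]
    return mean(ratios), stdev(ratios)


# Placeholder readings: six assays (duplicate plates, three assays each).
# The beta-galactosidase values vary to mimic differences in cell recovery.
BGAL = [0.5, 1.0, 2.0, 0.5, 1.0, 2.0]
SAMPLE_LUC = [1000, 2000, 4000, 1000, 2000, 4000]
CONTROL_LUC = [25, 50, 100, 25, 50, 100]

if __name__ == "__main__":
    fold, sd = fold_increase(SAMPLE_LUC, BGAL, CONTROL_LUC, BGAL)
    print(f"fold increase = {fold:.1f} (S.D. {sd:.2f})")
```

In this toy data set the raw luciferase readings differ fourfold between assays, but the variation disappears after division by the β-galactosidase signal, which is the purpose of the cotransfected control.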

End-labeling of Probes for mRNA 5' End Mapping — The Z2 or Z3
oligonucleotides (Z2, 5'-AATTCCCTGTGTGGTATTTCAGGTCCTTTCAGGC-3',
nt 193-226; Z3, 5'-GCAAGAGAGACCATCAAACTCCTGTCAGCCTCCT-3',
nt 68-101) for primer extension experiments or S1
analysis were end labeled using [γ-32P]ATP and T4 DNA kinase according
to standard protocols (21). Three ethanol precipitations were performed to
remove the residual [γ-32P]ATP from the labeled oligonucleotides.
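
As a quick consistency check on the oligonucleotides above, the stated coordinates should span exactly as many nucleotides as each sequence contains. The Python sketch below uses the Z2 and Z3 sequences and coordinates as printed; the helper names are illustrative, and the reverse complement is shown only because these are antisense primers.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# name: (sequence as printed, first nt, last nt)
PRIMERS = {
    "Z2": ("AATTCCCTGTGTGGTATTTCAGGTCCTTTCAGGC", 193, 226),
    "Z3": ("GCAAGAGAGACCATCAAACTCCTGTCAGCCTCCT", 68, 101),
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def coordinates_match(seq: str, first: int, last: int) -> bool:
    """True if the inclusive coordinate range has the same length as seq."""
    return len(seq) == last - first + 1


if __name__ == "__main__":
    for name, (seq, first, last) in PRIMERS.items():
        print(name, len(seq), "nt", coordinates_match(seq, first, last))
        print("  sense-strand equivalent:", reverse_complement(seq))
```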

S1 Analysis of mRNA Using Single-stranded DNA Probes — Established
methods were used to perform S1 analysis (27). Single-stranded
probes were made from the double-stranded 900(+) and D(+) plasmids.
Plasmids were alkali-denatured, and a 32P-5'-end-labeled oligonucleotide
primer, Z2 or Z3, was annealed to the template, 900(+) or D(+),
and extended with Klenow (Promega). Probes were cut to the appropriate
5' length by digestion with restriction enzyme KpnI. The single-stranded
probes were separated from the template DNA by alkaline
low-melting-point agarose electrophoresis, and radiolabeled bands were
cut out and purified by phenol extraction and ethanol precipitation (21).
Approximately 5000 cpm of probe was hybridized to 25 μg of total RNA
from day-14 chick sternal chondrocytes. The hybridization occurred at
55 °C for 12 h in an aqueous hybridization solution (21). The resultant
RNA:DNA hybrid was digested with 200 units of S1 nuclease for 60
min. The products were electrophoresed in 6% polyacrylamide sequencing
gels.

Primer Extension — Approximately 5000 cpm of labeled Z2 or Z3
probe was hybridized to 25 μg of RNA derived from day-14 chick sternal
chondrocytes. Hybridization was done in S1 hybridization solution for
12 h at 30 °C (21). Extended products were produced by treating the
hybrid RNA:primer with 40 units of avian myeloblastosis virus reverse
transcriptase (Promega). Products were extracted in phenol/chloroform,
precipitated in ethanol, and electrophoresed on 6% polyacrylamide
sequencing gels.

RESULTS 

Structural Analysis of the 5' Portion of the Chick Aggrecan 
Gene— To guide functional studies, the complete 1.8-kb Ag-1 
sequence was determined and found to comprise 1875 bp (Fig. 



3). Examination of the sequence revealed the lack of a classical 
TATA box or CCAAT box. When the Ag-1 fragment was ana- 
lyzed for transcription factor binding sequences, it was found 
that at least 202 potential sites were present, including putative
AP2 and Sp1 binding sites. The relative positions of some
of these eukaryotic transcription factor-associated sequences 
are indicated in Fig. 3. The numbering of the sequence is 
relative to the most upstream transcription start site (as de- 
tailed below). The Ag-1 sequence was also compared with 
known promoter sequences in the eukaryotic promoter data 
base (EPD) using the National Center for Biotechnology Infor- 
mation BLAST server (25), and no extensive identity with 
other promoter sequences was found. However, tracts of mul- 
tiple dA and dT residues, analogous to those found in Ag-1 in 
the ranges 250 to 280 and -144 to -78, respectively, were seen
to occur in many other described promoter regions. These dA
and dT tracts, in particular the dT16 from -87 to -78 and the
dA21 from 250 to 270, constitute an inverse repeat or palindrome
with the potential to give rise to a pair of large stem-and-loop
structures or a cruciform structure (28). Hence, additional
analyses were performed on the Ag-1 sequence with the
aim of detecting other, less obvious, palindromic sequences.

The Ag-1 sequence from positions -300 to 340 was analyzed
by comparison to its own reverse complement sequence with
the Wisconsin Package program COMPARE. The dot plot reveals
a widely spaced pair of inverted repeats centered around
-100 and 250, corresponding to the dT and dA tracts, separated
by over 300 bp. However, no other potential secondary
structures of comparable scale are seen in this sequence with
the window/stringency parameters used in this analysis; a few
less prominent repeat pairs occur in the downstream third of
the sequence. Interestingly, the putative Sp1, AP2, and TFII
sites, in addition to other potential factor-specific sequences, as
well as all three of the mapped start sites, lie in the putative
loop portion of this potential structure. Such secondary structures,
in addition to potential transcription factor binding sites,
may be involved in mechanisms by which the aggrecan message
is developmentally regulated.
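
The idea behind this analysis, comparing a sequence with its own reverse complement so that inverted repeats show up as matching windows, can be sketched in a few lines of Python. This is a simplified, exact-match illustration only; it is not the COMPARE/DOTPLOT algorithm, the window size is arbitrary, and the toy sequence (a dT tract and a dA tract separated by an unrelated spacer) is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeat_windows(seq: str, window: int) -> list[tuple[int, int]]:
    """Return 0-based (i, j) pairs with i < j such that seq[i:i+window]
    equals the reverse complement of seq[j:j+window]."""
    seq = seq.upper()
    last_start = len(seq) - window
    # Index every window of the forward strand by its sequence.
    index: dict[str, list[int]] = {}
    for i in range(last_start + 1):
        index.setdefault(seq[i:i + window], []).append(i)
    pairs = []
    for j in range(last_start + 1):
        partner = reverse_complement(seq[j:j + window])
        for i in index.get(partner, []):
            if i < j:  # report each pair once and skip self-matches
                pairs.append((i, j))
    return sorted(pairs)


# Hypothetical toy sequence: dT tract, unrelated spacer, dA tract.
TOY_SEQUENCE = "TTTTTT" + "GCAGTCCG" + "AAAAAA"

if __name__ == "__main__":
    print(inverted_repeat_windows(TOY_SEQUENCE, window=6))
```

Each reported pair corresponds to one dot on a plot of the sequence against its reverse complement. Tolerating mismatches within a window, as a stringency parameter does, would require a scoring comparison rather than the exact dictionary lookup used here.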

Determination of Transcription Starting Sites — Two methods,
S1 analysis and primer extension, were used to locate







[Fig. 2 graphic not reproduced. Panel A: the Ag-1 fragment, marking the 145-bp identity with exon 1 and the relative positions of the primers pr1300, pr900, pr500, prA, prB, prD, prE, prF, prZ1, and prZ0. Panel C: the pGL2-Basic vector, showing the SmaI, KpnI, XhoI, and BglII sites, the luciferase gene, and inserts in the (+) and (-) orientations.]

A (chart)

Primer pair     Coordinates     Size       Construct
pr1300, prZ1    -1038, +238     1276 bp    1300(+)
pr900, prZ1     -638, +238      876 bp     900(+)
pr500, prZ1     -247, +238      485 bp     500(+/-)
prA, prZ0       -283, +307      590 bp     A(+/-)
prB, prZ0       -240, +307      547 bp     B(+/-)
prD, prZ0       -69, +307       376 bp     D(+/-)
prE, prZ0       -1, +307        308 bp     E(+/-)
prF, prZ0       +168, +307      140 bp     F(+/-)

B

Name      Primer sequence                                      Size      Engineered restriction sites
pr1300    AAG CGC TCG AGA TOG CTG GAT GAA AAG CAG              30-mer    XhoI
pr900     ACT CTC GAG OGC TCT TAA GTC CTT CTA CAA              30-mer    XhoI
pr500     TAA CTC GAG CTT CTC TCT CAA CCA CTT GT               29-mer    XhoI
prA       GGG GTA CCC CAT AAA CCG TGC TCT CTT                  27-mer    KpnI
prB       GGG GTA CCC CTC AAC CAC TTG TOG TGC                  27-mer    KpnI
prD       GGG GTA CCC CAA GAG OCT COS CTT OCT                  27-mer    KpnI
prE       GGG GTA CCT TTA CTC TAA AAT CCA G                    25-mer    KpnI
prF       GGG GTA CCC CTT CAA GCC TGC TGC TGC                  27-mer    KpnI
prZ1      GGA ATT CCT CGA GAG AAG AAA TCA CAA TTC CCT          33-mer    EcoRI, XhoI
prZ0      TCC CCC GGG GGA AGA TCT TCA AGC TTA GTT AGA TCT C    37-mer    SmaI, BglII



Fig. 2. Schematic representation of the pGL2-Basic vector and the primers used to synthesize the deletion constructs. A, the 1.8-kb
genomic fragment Ag-1. The checkered pattern represents the 145-bp identity with the aggrecan cDNA. Above the diagram are the restriction enzymes
used to excise this genomic fragment from the clone G8. Below are shown the names and relative positions of the primers used to make the deletion
constructs. To the right side of the diagram is a chart that indicates the coordinates that the primer pairs span, the size of the resultant PCR product,
and the names of the constructs. B, below the diagram is a list of all of the primers used to make deletion constructs with the Ag-1 sequence as a
template. The boldface represents sequences not found in Ag-1 but added to engineer restriction enzyme cutting sites. Note that prZ0 resides in the
vector pGL2-Basic and not in the Ag-1 sequence. C, a schematic representation of the pGL2-Basic vector and the relative positions of the restriction
sites used in these experiments. The arrow represents the direction of transcription of the vector. The (+) and (-) orientations are defined as the
positionings of the insert with respect to the luciferase gene in the same or reversed ways as it occurs with respect to the aggrecan coding sequence.



the sites where transcription of the aggrecan mRNA is initiated.
Because the 5'-untranslated cDNA sequence previously
reported by this laboratory (11, 12) overlaps with the 3' end
of the Ag-1 genomic isolate by 145 nucleotides, transcription
initiation occurs still farther upstream in Ag-1. Templates
used to generate single-stranded DNA probes for S1 analysis
included the 900(+) and D(+) plasmid constructs, as represented
in Fig. 4C. S1 analysis with the downstream primer
Z2 yielded three major protected fragments: 226 bp, 187 bp,
and a 69/70-bp doublet, corresponding to start sites at positions
1, 40, and 157-158 (Fig. 4A, lanes 1 and 2). Position 1 in
Fig. 3 is defined as the farthest 5' transcription starting site.
These locations were obtained with probes generated from
both the 900(+) and the D(+) constructs. The two upstream
transcription start sites at positions 1 and 40 were confirmed
with the downstream primer Z3-generated probes, again using
the 900(+) and D(+) constructs as DNA templates (Fig.
4B, lanes 4 and 5). Z3-generated probes from the 900(+) and
D(+) constructs gave protected fragments of 101 and 62 bp,
respectively, confirming the position 1 and 40 transcription
starting sites. The Z3 primer lies upstream of the 157/158
transcription starting site.

Primer extension experiments used the same antisense oligonucleotides,
Z2 and Z3, as used in the S1 analyses. Primer
extensions on RNA from cultured day-14 sternal chondrocytes
gave products of the same sizes as the corresponding S1-protecting
experiments, confirming the three transcription starting sites
at positions 1, 40, and 157-158, as shown in Fig. 4, A and B, lanes
3 and 6. These results are represented schematically in Fig. 4D.
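
The fragment sizes reported above follow directly from the primer coordinates: each protected fragment or extension product runs from the labeled 5' end of the antisense primer (nt 226 for Z2, nt 101 for Z3) upstream to the transcription start site. A minimal Python sketch of that arithmetic, using only coordinates given in the text (the names are illustrative):

```python
# Sense-strand coordinate of the labeled 5' end of each antisense primer.
PRIMER_5P_END = {"Z2": 226, "Z3": 101}
START_SITES = (1, 40, 157, 158)


def product_length(primer_end: int, start_site: int):
    """Length of the fragment from the start site through the labeled
    primer end, or None if the start site lies downstream of the primer."""
    if start_site > primer_end:
        return None
    return primer_end - start_site + 1


if __name__ == "__main__":
    for name, end in PRIMER_5P_END.items():
        sizes = [product_length(end, site) for site in START_SITES]
        print(name, dict(zip(START_SITES, sizes)))
```

For Z2 this gives 226, 187, 70, and 69 nt, matching the 226-bp, 187-bp, and 69/70-bp fragments; for Z3 it gives 101 and 62 nt and no product for the 157/158 site, which lies downstream of that primer.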

Functional Analysis of the Aggrecan Promoter Sequence —
Transient transfections of day-14 chick embryo sternal chondrocytes
with the construct Ag-1(+) (the forward orientation of
the 1.8-kb insert in the promoter/enhancer-free pGL2-Basic
reporter vector) revealed a plasmid dose-dependent level of
luciferase expression (Fig. 5A), i.e. increasing concentrations of
transfected construct produced increases in luciferase activity,
establishing that the 1.8-kb region contains elements capable
of promoter function. In subsequent experiments, constructs
Ag-1(+) and Ag-1(-), in addition to pGL2-Basic vector with no
insert, were transiently transfected into both 14-day-old chick
sternal chondrocytes and, to examine tissue specificity, into
10-day-old chick embryo fibroblasts. In transfected chondrocytes,
the construct Ag-1(+) produced a 45-fold increase in luciferase
activity compared with the no-insert control (Fig. 5B), whereas
transfected fibroblasts produced less than a 10-fold increase.
Transfections with either the negative control pGL2-Basic vector
with no insert or the Ag-1(-) construct resulted in much
lower luciferase expression, with activity equivalent to background
in both transfected chondrocytes and fibroblasts.

A series of constructs that progressively deleted the Ag-1(+) 
sequence was used to relate the locations of potential transcrip- 
tion factor binding sites and secondary structure to promoter 




~ 1569 AGA^TCTCCAATCTTAATGA^^ 

MSP__CS NF-E1_CS1 GR-MT-IIA p53_CS 

-1500 GAGCACTGATGTCAAGATCATGG^ 

ElA-F_CS c-fos_SRE_half-site PEA3_CS 

- 1 4 0 0 AGCAGACTATGAAACGGGACATGGTACTAAACTAATTCATG^ 

hsp-70 . 4 

- 13 00 GTGCGAGGCTTGGCX3CAyGGAGCAGGGCC^ 

NFI_CS1 histone_H-4_CS . 2 

-1200 ACACTCTGAAGAAGCATCCATACTGTg^AGGG 

ZRE6 LyF-1 PTFl (CA) n U2snRNA IE1.2 

-1100 AGTAGTATTCTmCC^^ 

C/EBP_SV40-1 NF-E1 . 6 prl300— > malT_.CS 

-1000 GCCTATGCTAGGGAACTGTGCAGGTTCAGTC 

WAP_US5 IElt2 

-900 COCAQCTGTACAGTCTCAGCTGTTCCCCACC m?I^CAGCITCCCCTAGACCCTTC 

GT-2B_RS CIIS2 malT_malPp CK-8-mer 

-800 TCCTCCAGCAC^GTCTCTACTCA 

malT_raalPp MT-I.l LVa_RS CIIS2 

-7 00 CTGCTGAGAATTOlATTACAGGAT^ 

pr9 00 — > lainbda-boxA 

- 6 0 0 ACAAACAGCCCC^TCTCTTAGCCX:^ 

TFIID-EIIA c-Myc_RSl NF-E1.3 NF-El uteroglobin^- 2 . 4_CS 

-500 TTGCCTGCTCACACATCAGGACCACTTC^ 

(CAJn HC3 IgHc.12 MRE_CS2 H-2RIXBP/T3R-alpha-regionII 

-400 CTTTAGGCTCCCCAGTACATGTGCTCATTTCT 

ZRE7 ■ Hox-1 

-300 CATCITATTTTAGTTK^^ 

prA — > IgHC.21 pr500— > prB — > c-mos BRV-E2_CS2 

-200 C^AGGGCTAAATTTAATCCCCC^CT^ 

histone-CAP-box CAP -box C/EBP_CS1 

IE1 . 2prD— >Spl_CS4 GR-MT-IIA AP-2_CS4 prE~> 

1 ACCTTTACTCTAAAATCCAGAGGAGG^ 

t#l TATA- like-motif f#2 TATA- like-motif TFII-I-HIV-l-Inrl 



101 AGCaGTGGCAG£TAATj£T<3GTCIX3 ^ 

<--prZ3 Isl-l CIIS1 histone_CAP„box <CA) n tt#3 prF— > 

201 GGACCTGAAATAGC^eACAGGGAA 

E-alpha_H_box (CA) n <— prZ2H4TF-lhist<— prZl ~ — ~ 



301 AAGACA rACAGCAACAGCAAGAAGTGGCAAGCTCTTTCCGT^ 
< — prZO 



401 GCAC7IGCACTGAACTGTTAAAG GTAAACTATGACCACTCTACTACTAGTGTTTGTGT^ 458 

>8KB intron 1 MTTLLLVFVC 

Fig. 3. Nucleotide sequence and putative regulatory elements of the 5' flanking region of the chick aggrecan gene. The three major 
transcription start sites are indicated by the daggers followed by the respective number. Putative transcription factor binding sequences are 
underlined, and the GenBank names are printed below the underlined sequence. Overlapping binding sequences are in italic print. Sites were 
defined using the program FINDPATTERNS or from published papers. For clarity, only selected potential transcription factor binding sequences 
have been shown. Additionally, the 5' region of primers designed to create the various deletion constructs are highlighted in boldface in the 
sequence, and below the sequence, the names of the primers are printed in boldface with arrows indicating the direction of their orientation. 
Dashed lines represent the >8 kb of intron that separates exon 2 (SP) from the first untranslated exon. The boxed sequence represents the region 
of the clone Ag-1 that overlaps with the cDNA sequence published previously. Note that position +307 is the BbsI cutting site; thus, the Ag-1
sequence ends at this point. 



function and tissue specificity. The constructs and transfection
results are summarized in Fig. 6. The initial deletion removed
approximately 500 bp from the upstream end of the Ag-1(+)
construct, as well as a tract of 21 dA residues from the downstream
end. The resulting construct, 1300(+), produced a modest
increase in luciferase activity in chondrocytes versus that







[Fig. 4 graphic not reproduced. Panels A and B (lanes 1-6) are not recoverable. Panel C, S1 protection scheme:
1. Denature plasmid DNA.
2. Hybridize radiolabeled probe Z2 or Z3 to denatured plasmid DNA.
3. Extend with Klenow.
4. Digest with KpnI.
5. Alkaline gel electrophoresis.
6. Extract single-stranded radiolabeled probe.
7. Hybridize single-stranded DNA to chick cartilage RNA.
8. Digest with S1 nuclease.
9. Size products on a polyacrylamide gel.
Products: Z2 probe, coordinates 1, 40, and 157/158; Z3 probe, coordinates 1 and 40.
Panel D, primer extension scheme:
1. Hybridize Z2 or Z3 probe to RNA.
2. Extend with AMV reverse transcriptase.
3. Size products on a polyacrylamide gel.
Products: Z2 probe, coordinates 1, 40, and 157/158; Z3 probe, coordinates 1 and 40.]

Fig. 4. S1 analysis and primer extension. Conditions are described under "Experimental Procedures." A, the results of S1 analysis,
sequencing, and primer extension from the oligonucleotide Z2. Lane 1, S1 protection bands resulting from the D(+)-derived probe spanning
nucleotides -69 to +226; lane 2, products resulting from the probe derived from the 900(+) construct, spanning the region -638 to +226; lane 3,
results of a primer extension experiment using 32P-end-labeled oligonucleotide Z2. B, results of S1 analysis, sequencing, and primer extension from
the oligonucleotide Z3. Conditions for S1 analysis and primer extension were the same as for A. Lane 4, S1 protection products from the
single-stranded DNA probe spanning the region -69 to +101, derived from D(+); lane 5, products from the probe spanning the region -638 to
+101, derived from 900(+); lane 6, results of a primer extension experiment using 32P-end-labeled oligonucleotide Z3. Arrows, the location of the
major bands. The bands at position 157-158 consistently appear as a doublet in both S1 analysis and primer extension experiments. Only bands
that were generated in both types of experiments were marked; other bands are potentially artifactual because they cannot be duplicated in the
complementary experiment. C and D schematically show the design and results of the S1 protection and primer extension experiments,
respectively. Open boxes, the radiolabeled oligonucleotide Z2; slashed boxes, the radiolabeled oligonucleotide Z3. Bricks, RNA; * above the RNA, the
determined transcription start sites.



promoted by the construct Ag-1(+). Transfected fibroblasts 
showed little difference in luciferase activity from Ag-1(+) to 
1300(+); the latter was slightly lower. Deletion of another 500 
bp from the 5' end (including a CIIS2 site) generated the 
construct 900(+); this deletion had a dramatic effect, because 
both chondrocyte and fibroblast luciferase yields nearly tripled 
when compared with assays of the original Ag-1(+) construct 
(to 140- and 30-fold, respectively). Although chondrocyte activity
remained substantially higher than that in fibroblasts,
there was a greater proportional increase in luciferase activity 
in fibroblasts, 260% when compared with the 1300(+) construct 
in fibroblasts versus a 160% increase in chondrocytes. This 
increase may be due to loss of tissue specificity or to coinciden- 
tal but independent effects of silencers in both cell types. 

Removal of approximately 400 additional bp from the up- 
stream end of the 900(+) construct (including another CIIS2 






[Fig. 5 graphic not reproduced. Panel A, x axis: amount of plasmid, Ag-1(+). Panel B, x axis: constructs (forward, reverse, no insert).]

Fig. 5. Promoter activity in the 5' flanking region of the chick aggrecan gene. A, the dose-dependent luciferase activity curve resulting 
from the expression of the Ag-1(+) construct. Ag-1(+) is the 1.8-kb promoter/enhancer region from the aggrecan gene placed in the reporter vector 
pGL2-Basic. Various amounts of plasmid, ranging from 5 to 15 pmol, were transfected into day-14 chick embryo sternal chondrocytes. Duplicate 
plates were transfected, and each plate was assayed for luciferase activity three times. An average value and S.D. (bars) were determined for all six 
assays at each data point. Results were normalized by cotransfection of 5 μg of a β-galactosidase reporter plasmid (Promega). B, the orientation and
cell type specificity of the construct Ag-1(+). The activity of the reporter vector, pGL2-Basic, with no insert was defined as one and used to calculate
the relative activities of the other constructs. Again, β-galactosidase expression was used to normalize the plates for transfection efficiency and cell
loss; statistical analysis was done in the same fashion as above. Both day-14 chick sternal chondrocytes and day-10 chick fibroblasts were
transfected. At the time of transfection, cell density per dish was approximately 5 million. The transfection was allowed to proceed for 36 h. 



site) produced the 500(+) construct. Promoter activity in chon- 
drocytes returned to approximately 50-fold, similar to that 
assayed for the constructs Ag-1(+) and 1300(+); yet in fibro- 
blasts, luciferase activity of the 500(+) construct was only 
slightly lower than that seen for the 900(+) construct (Fig. 6). 
This finding suggests that the upstream half of 900(+) may 
contain enhancer elements that are used in chondrocytes. 

A newly generated construct A(+), 590 bp, was made that 
was similar to the 500(+) construct, except that the insert 
contained the 3' stretch of poly(dA) regions and 36 bp in the 5' 
direction to include the putative IgHC.21 site (Fig. 3). These
changes produced a modest increase in luciferase activity in 
chondrocytes only. Measured luciferase activity in fibroblasts 
modestly decreased when compared with the luciferase activity 
measured from fibroblasts transfected with the 500(+) con- 
struct. The deletion construct B(+), 547 bp, which does not 
contain the IgHC.21 site, lost approximately 40% of the activity 
of the A(+) construct in chondrocytes; the activity in fibroblasts 
was reduced by 70%, resulting in luciferase activity as low as 
that seen for many of the (-) constructs. A further deletion 
construct, D(+), 376 bp, which included only the three transcription
start sites and the putative Sp1 and AP2 binding
sites, produced a significant amount of luciferase activity in 
chondrocytes (nearly 60-fold), and in transfected fibroblasts 
luciferase activity was equivalent to the 1.8-kb Ag-1(+) con- 
struct. The D(+) construct deleted the poly(dT) region but 
included the poly(dA) region. The 308-bp construct, E(+), in- 
cluded the three major start sites at positions 1, 40, and 157/158
but did not include the consensus sequences Sp1-CS4,
GR-MT-IIA, and AP-2-CS4. Deletion of these potential nuclear 
factor binding sites caused a 75% loss of activity in chondro- 
cytes while not substantially altering luciferase activity in 
transfected fibroblasts. Construct E(+) had comparable lucif- 
erase activity in both chondrocytes and fibroblasts of approxi- 
mately 15-fold when compared with the no-insert control vec- 
tor. The 140-bp construct F(+) did not include any of the 
determined starting sites and produced modest luciferase ac- 
tivity in transfected chondrocytes and baseline luciferase activity
in transfected fibroblasts. In all but one instance, the
reverse orientation constructs of all of these genomic fragments 
yielded minimal luciferase activity in both transfected chon- 
drocytes and fibroblasts. That exception, the activities seen for 
the 500(-) construct, suggests that some low-level promoter 
activities may result from largely accidental sequence assem- 
blages. In sum, the data suggest the following functional roles 
for portions of the aggrecan 5' flanking sequence in the two cell 
types: 1) general repression upstream of the pr900 site, especially
between -638 and -1038 (pr1300); 2) strong chondrocyte-specific
enhancement in the pr900-pr500 interval (-638 to
-247); 3) a positive element, possibly IgHC.21, occurs in the
small prA-prB interval (-283 to -240); 4) the prB-prD segment
(-240 to -69) has a negative role, strongest in fibroblasts;
and 5) the small (-69 to -1) prD-prE interval, bearing
Sp1 and AP-2 elements, is stimulatory in chondrocytes. It is
also apparent that constructs lacking either the dT or dA tracts 
(e.g. 900(+) and D(+)) are quite active; therefore, interaction 
between these repeats is not required for promoter function in 
this system. 

DISCUSSION 

We have found that a 1.8-kb genomic fragment from the 5' 
end of the chick aggrecan gene is able to drive expression of the 
pGL2-Basic luciferase reporter gene in a tissue-specific man- 
ner. Determining the sequence of this construct revealed more 
than 202 potential transcription factor binding sites. This 
structural information allowed us to proceed with a functional 
analysis of the effects of potentially active cis elements that 
may confer tissue and developmental specificity on expression 
of the aggrecan gene by using a series of nested deletion con- 
structs. These sequences ranged from the full 1.8 kb (Ag-1(+)) 
to a minimal 140-bp construct (F(+)).

Of the numerous potential cis elements found in the Ag-1 
sequence, several are of particular interest with respect to 
control of aggrecan expression. Positions -873 and -721 in the 
Ag-1 sequence are the 5' ends of two copies of the sequence 
CACCTCC (CIIS2), which has been suggested to be a silencer 







[Fig. 6 graphic not reproduced. Panel B: deletion constructs in pGL2-Basic upstream of the luciferase gene, with the direction of vector transcription and the no-insert control indicated. Panel C: fold increase (axis ticks at 50, 100, and 150) for chondrocytes and fibroblasts.]



Fig. 6. Structure and differential promoter functions of the aggrecan 5' flanking region. A, a schematic of the genomic structure of the 
chick S103L-reactive CSPG (aggrecan) gene. B, the set of deletion constructs. Inserts derived from the aggrecan promoter/enhancer region were 
ligated into the pGL2-Basic (Promega) reporter vector, which carries the luciferase gene. Both the forward and reverse orientations were 
constructed as indicated by (+) or (-). Subsequent deletion constructs were generated by PCR using the construct Ag-1(+) as a template (see
"Experimental Procedures"). C, relative luciferase promoter activity of the various deletion constructs. The activity of the reporter vector with no
insert was defined as one and used to calculate the activities of the other constructs. Duplicate plates were transfected, and each plate was assayed 
for luciferase activity three times. All experimental details are as presented in the legend to Fig. 5. 



motif in the COL2A1 promoter (29). This particular sequence 
has been shown to inhibit transcription of the type II collagen 
promoter in fibroblasts while not significantly changing expres- 
sion in chondrocytes (29). Indeed, this seems to be consistent 
with our results because deletion of these two motifs from the 
1300(+) to 900(+) constructs reduced the cell type specificity of
luciferase expression while the overall promoter activities in- 
creased. This motif is also present in the promoter region of 
COL4A2; however, tissue-specific regulation in fibroblasts ver- 
sus chondrocytes remains to be investigated in this system (30). 

The chick aggrecan 5' flanking region contains a second 
silencer consensus sequence, (CIIS1) ACCCTCTCT (29) at po- 
sition 127, which is also found in COL2A1. The CIIS1 sequence 
occurs in an interspersed rat repetitive sequence (31) and in 
another repetitive sequence found in the avian genome named 
the CR1 element (32, 33). Further negative regulatory func- 
tions have been shown in the chick lysozyme gene (34), rat 
insulin gene (31), mouse IgH gene (35), human β-interferon
gene (36), and the human ε-globin gene (37). In the Ag-1 sequence,
this motif is located within 200 bp downstream of the
putative Sp1 site. A "push and pull" mechanism has been
proposed for transcriptional regulation in two systems, the low
density lipoprotein receptor gene and the COL2A1 gene (29,
38). This model proposes that the sterol-dependent binding of a
protein to a consensus sequence could inhibit the positive activation
of a nearby Sp1 binding site (38); such a silencer
element acting in a "push and pull" mechanism could likewise 



be responsible for the temporal and tissue-specific regulation of 
the aggrecan gene. 

The Ag-1 sequence contains one putative NF-I site at posi- 
tion -1282. The NF-I proteins are transcriptional activators 
derived from a multigene protein family in the vertebrate phy- 
lum (39-42). Chick tissues contain NF-I products that are 
derived from four separate genes that have the potential of 
producing 12 isoforms (42). Recently, it has been shown that 
the silencer SI is very similar to the NF-I/CTF family, and an 
additional silencer, SII, is similar to an NF-I/CTF half site (43).
This suggests that NF-I-related proteins can mediate tran- 
scriptional repression in cells of mesenchymal origin (42). Our 
sequence does not contain the sequence motifs of SI or SII, but 
Szabo et al. (43) suggest that the NF-I family of regulator 
proteins can be modulated as silencers in addition to their 
previously accepted role as activators. The presence of a puta- 
tive NF-I site raises the possibility of mesenchyme-specific 
regulation controlled by this element in addition to possible 
modulation by unreported silencers, thus creating a more dy- 
namic system than one based solely on NF-I activation. 

From footprinting analysis, Long and Linsenmayer (44) re- 
ported a novel transcription factor binding sequence, ACACA- 
CAGA, acting in the regulation of COL10A1, and suggested 
that this factor may act as a silencer. The proximal promoter 
region of COL10A1 is responsible for regulating expression in 
hypertrophic chondrocytes (44). Our reported sequence contains
four positions, -1140, -491, 151, and 214, where the
CACACA motif is present. Perhaps these sequences are involved
in chondrocyte-specific expression of aggrecan. The CACACA
motif may also be relevant because repeats of (CA)n are
markers for Z-DNA formation, contributing to secondary struc- 
ture (45). Moreover, this motif has been shown to be a potential 
hot spot for recombination and can contribute to gene expres- 
sion (26). Clustering of these sequences near the transcrip- 
tional start sites that have been identified for chick aggrecan 
may contribute to the mechanism of transcriptional regulation 
by altering DNA secondary structure. 
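
The kind of motif search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `find_motif` and `to_relative` are hypothetical, the toy sequence is invented, and the coordinate convention (start site numbered +1, no position 0) is assumed rather than taken from the study.

```python
def find_motif(seq, motif):
    """Return the 0-based start index of every occurrence of motif in seq.

    Overlapping occurrences are reported, which matters for (CA)n repeats.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def to_relative(index, tss_index):
    """Convert a 0-based index to a position relative to a start site.

    Assumed convention: the start site itself is +1 and there is no
    position 0, so the base immediately upstream is -1.
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


if __name__ == "__main__":
    # Invented toy sequence, not the Ag-1 sequence.
    toy = "GGTCACACAGATTTCACACACAGG"
    for hit in find_motif(toy, "CACACA"):
        print(hit, to_relative(hit, tss_index=10))
```

Reporting overlapping hits is a deliberate choice here, since a run such as CACACACA contains more than one CACACA window.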

The chick aggrecan promoter exhibits <40% sequence simi- 
larity to either the mouse promoter (19) or the 120-bp rat (20) 
promoter fragment, indicating that this promoter/enhancer re- 
gion is not highly conserved across the taxa. Interestingly, the 
untranslated first exon in chick aggrecan contains less than 
45% similarity compared with rat, mouse, or human sequences 
(19, 20, 46). Although the lack of identifiable similarity be- 
tween the chick and mammalian aggrecan first exons might be 
attributable to the existence of fewer selection pressures on an 
untranslated sequence, this argument is not readily extended 
to promoter sequences. Also puzzling is that although the rat 
and mouse promoter sequences share 93% identity with each 
other, none of the described transcriptional start sites coincide 
with each other in these two similar promoter regions. 
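
As a rough illustration of how a percent-identity figure such as those quoted above can be derived from an alignment, the sketch below counts matching columns in two pre-aligned sequences. It is not the procedure used in the study; the function name `percent_identity`, the use of `-` for gaps, and the choice to exclude gapped columns from the denominator are all assumptions, and alignment programs differ on that last point.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are skipped, so the
    denominator is the number of gap-free columns (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Invented toy alignment, not real promoter sequence.
    print(percent_identity("ACGT-A", "ACCTTA"))
```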

There are, however, similarities in TATA-binding motifs 
among promoters of cartilage-specific genes. As is the case for 
the mouse and rat aggrecan promoter regions, the chick 5' 
flanking sequence lacks a classical TATA box and contains 
multiple transcriptional start sites (19, 20). Although a TATA- 
less promoter with multiple GC-rich regions is the hallmark of 
many housekeeping genes (47), many other genes that are 
temporally regulated have been shown to have promoters with 
similar structures (48, 49). It is interesting that the 5' flanking 
sequence of the chick link protein gene also contains multiple 
transcription start sites and lacks a classical TATA box (50); 
rather, it has a TATA motif-like sequence TCTAA (51). The 
chick aggrecan sequence contains two TCTAA motifs, one that 
is 31 bp and another that is 94 bp upstream of the start sites at 
positions 40 and 157-158, respectively (Fig. 3). The TCTAA 
sequence is also present in the human and chick link protein 
promoter region (50, 52) and in the serine/glycine-rich proteo- 
glycan (51). However, human link protein has only one tran- 
scription start site (52). Thus, it would be interesting to deter- 
mine whether the human aggrecan sequence has only one 
transcription start site, which would provide further evidence 
for similarity in the evolution of the link protein and the ag- 
grecan genes, as has been suggested (13). 
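
The distances quoted above for the TCTAA motifs are simple coordinate differences. The sketch below pairs a motif position with the nearest downstream start site and reports the gap. The function name `nearest_downstream_start` is hypothetical, and the motif positions 9 and 63 in the usage lines are chosen purely for illustration so that the reported 31-bp and 94-bp distances fall out; they assume that distance is measured from the first base of the motif and are not values stated in the text.

```python
def nearest_downstream_start(motif_pos, start_sites):
    """Return (start_site, distance) for the closest start site downstream
    of motif_pos, or None if no start site lies downstream.

    All positions must be in the same coordinate system.
    """
    downstream = [site for site in start_sites if site > motif_pos]
    if not downstream:
        return None
    site = min(downstream)
    return site, site - motif_pos


if __name__ == "__main__":
    start_sites = [40, 157]  # two of the start-site positions named in the text
    # Motif positions below are illustrative assumptions only.
    print(nearest_downstream_start(9, start_sites))
    print(nearest_downstream_start(63, start_sites))
```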

Overall, this study has established the 5' flanking sequence 
as having three major transcription start sites in addition to 
several putative cis elements and a potential secondary struc- 
ture that may control expression of the aggrecan gene. We have 
demonstrated tissue-specific promoter activity with the 1.8-kb 
region and have systematically mapped subregions that produce 
activation or repression of downstream reporter genes in two cell 
types in culture. This study paves the way for more directed 
studies of the individual cis elements identified and their inter- 
action with trans-acting factors so that we may better under- 
stand the mechanisms by which the aggrecan gene is regulated. 

Acknowledgment — We thank Dr. Miriam Domowicz for helpful sug- 
gestions during the course of this study. 

REFERENCES 

1. Domowicz, M. S., Li, H., Hennig, A. K., Vertel, B., and Schwartz, N. B. (1995) Dev. Biol. 171, 655-664
2. Geetha-Habib, M., Campbell, S., and Schwartz, N. B. (1984) J. Biol. Chem. 259, 7300-7310
3. Campbell, S. C., and Schwartz, N. B. (1988) J. Cell Biol. 106, 2191-2202
4. Kearns, A. E., Vertel, B. M., and Schwartz, N. B. (1993) J. Biol. Chem. 268, 11097-11104
5. Schwartz, N. B. (1995) Trends Glycosci. Glycotechnol. 7, 429-445
6. Bourdon, M. A., Krusius, T., Campbell, S., Schwartz, N. B., and Ruoslahti, E. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 3194-3198
7. Campbell, S. C., Krueger, R. C., and Schwartz, N. B. (1990) Biochemistry 29, 907-914
8. Krueger, R. C., Jr., Fields, T. A., Hildreth, J., IV, and Schwartz, N. B. (1990) J. Biol. Chem. 265, 12075-12087
9. Krueger, R. C., Jr., Fields, T. A., Mensch, J. R., Jr., and Schwartz, N. B. (1990) J. Biol. Chem. 265, 12088-12097
10. Dennis, J. E., Carrino, D. A., Schwartz, N. B., and Caplan, A. I. (1990) J. Biol. Chem. 265, 12098-12103
11. Li, H., Schwartz, N. B., and Vertel, B. M. (1993) J. Biol. Chem. 268, 23504-23511
12. Li, H., Domowicz, M. S., Hennig, A., and Schwartz, N. B. (1996) Mol. Brain Res. 36, 309-321
13. Li, H., and Schwartz, N. B. (1995) J. Mol. Evol. 41, 878-885
14. Levitt, D., and Dorfman, A. (1972) Proc. Natl. Acad. Sci. U. S. A. 69, 1253-1257
15. Schwartz, N. B., Hennig, A. K., Krueger, R. C., Krzystolik, M., Li, H., and Mangoura, D. (1993) in Limb Development and Regeneration (Fallon, J. F., Goetinck, P. F., Kelley, R. O., and Stocum, D. L., eds) pp. 505-514, Wiley-Liss, Inc., New York
16. Myers, J. C., and Dion, A. S. (1990) in Extracellular Matrix Genes (Sandell, L. J., and Boyd, C. D., eds) pp. 57-78, Academic Press, Inc., San Diego
17. Sandell, L. J., and Boyd, C. D. (1990) in Extracellular Matrix Genes (Sandell, L. J., and Boyd, C. D., eds) pp. 1-56, Academic Press, Inc., San Diego
18. Lee, B., D'Alessio, M., and Ramirez, F. (1991) Crit. Rev. Eukaryot. Gene Expr. 1, 173-187
19. Watanabe, H., Gao, L., Sugiyama, S., Doege, K., Kimata, K., and Yamada, Y. (1995) Biochem. J. 308, 433-440
20. Doege, K. I., Garrison, K., Coulter, S. N., and Yamada, Y. (1994) J. Biol. Chem. 269, 29232-29240
21. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
22. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
23. Program Manual for the Wisconsin Package, version 8 (1994) Genetics Computer Group, Madison, WI
24. Boguski, M. S., Caballero, L., Eisenberg, D., Elliston, K., Luthy, R., Rice, P. M., and States, D. J. (1992) Sequence Analysis Primer (Gribskov, M., and Devereux, J., eds), W. H. Freeman and Co., New York
25. Ghosh, D. (1992) Nucleic Acids Res. 20s, 2091-2093
26. Cahn, R. D., Coon, H. G., and Cahn, M. B. (1967) in Methods in Developmental Biology (Wilt, F. H., and Wessels, N. K., eds) Thomas Y. Crowell Co., New York
27. Sharp, P. A., Berk, A. J., and Berget, S. M. (1980) Methods Enzymol. 65, 750-768
28. Adams, R. L. P., Knowler, J. T., and Leader, D. P. (1992) The Biochemistry of the Nucleic Acids, Chapman and Hall, New York
29. Savagner, P., Miyashita, T., and Yamada, Y. (1990) J. Biol. Chem. 265, 6669-6674
30. Saitta, B., and Chu, M. (1994) Eur. J. Biochem. 223, 675-682
31. Laimins, L., Holmgren-Konig, M., and Khoury, G. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 3151-3155
32. Hache, R. J. G., and Deeley, R. G. (1988) Nucleic Acids Res. 16, 97-113
33. Stumph, W. E., Hodgson, C. P., Tsai, M.-J., and O'Malley, B. W. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 6667-6671
34. Baniahmad, A., Muller, M., Steiner, C., and Renkawitz, R. (1987) EMBO J. 6, 2297-2303
35. Kadesh, T., Zervos, P., and Ruezinsky, D. (1986) Nucleic Acids Res. 14, 8209-8221
36. Goodbourn, S., Burstein, H., and Maniatis, T. (1986) Cell 45, 601-610
37. Cao, S. X., Gutman, P. D., Dave, H. P. G., and Schechter, A. N. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 5306-5309
38. Dawson, P. A., Hofmann, S. L., van der Westhuyzen, D. R., Sudhof, T. C., Brown, M. S., and Goldstein, J. L. (1988) J. Biol. Chem. 263, 3372-3379
39. Paonessa, G., Gounari, F., Frank, R., and Cortese, R. (1988) EMBO J. 7, 3115-3123
40. Santoro, C., Mermod, N., Andrews, P. C., and Tjian, R. (1988) Nature 334, 218-224
41. Rupp, R. A. W., Kruse, U., Multhaup, G., Gobel, U., Beyreuther, K., and Sippel, A. E. (1990) Nucleic Acids Res. 18, 2607-2616
42. Kruse, U., and Sippel, A. E. (1994) J. Mol. Biol. 238, 860-865
43. Szabo, P., Moitra, J., Rencendorj, A., Rakhely, G., Rauch, T., and Kiss, I. (1995) J. Biol. Chem. 270, 10212-10221
44. Long, F., and Linsenmayer, T. F. (1995) J. Biol. Chem. 270, 31310-31314
45. Vogt, P. (1990) Hum. Genet. 84, 301-336
46. Valhmu, W., Palmer, G., Rivers, P., Ebara, S., Cheng, J.-F., Fischer, S., and Ratcliffe, A. (1995) Biochem. J. 309, 535-542
47. Sehgal, A., Patil, N., and Chao, M. (1988) Mol. Cell. Biol. 8, 3160-3167
48. Bohm, S. K., Gum, J. R., Jr., Ericson, R. H., Hicks, J. W., and Kim, Y. S. (1995) Biochem. J. 311, 835-834
49. Wick, M., Haronen, R., Mumberg, D., Burger, C., Olsen, B. R., Budarf, M. L., Apte, S. S., and Muller, R. (1995) Biochem. J. 311, 549-554
50. Deak, F., Barta, E., Mestric, S., Biesold, M., and Kiss, I. (1991) Nucleic Acids Res. 19, 4983-4990
51. Avraham, S., Avraham, H., Austen, K. F., and Stevens, R. L. (1992) J. Biol. Chem. 267, 610-617
52. Dudhia, J., Bayliss, M. T., and Hardingham, T. E. (1994) Biochem. J. 303, 329-333



The EMBO Journal Vol.17 No.19 pp.5718-5733, 1998 



A new long form of Sox5 (L-Sox5), Sox6 and Sox9 
are coexpressed in chondrogenesis and 
cooperatively activate the type II collagen gene 



Veronique Lefebvre¹, Ping Li and
Benoit de Crombrugghe¹

Department of Molecular Genetics, The University of Texas M.D.
Anderson Cancer Center, 1515 Holcombe Boulevard, Box 11,
Houston, TX 77030, USA

¹Corresponding authors

e-mail: benoit_decrombrugghe@molgen.mdacc.tmc.edu or
veronique_lefebvre@molgen.mdacc.tmc.edu 

Transcripts for a new form of Sox5, called L-Sox5,
and Sox6 are coexpressed with Sox9 in all chondrogenic 
sites of mouse embryos. A coiled-coil domain located 
in the N-terminal part of L-Sox5, and absent in Sox5, 
showed >90% identity with a similar domain in Sox6 
and mediated homodimerization and heterodimeriza- 
tion with Sox6. Dimerization of L-Sox5/Sox6 greatly 
increased efficiency of binding of the two Sox proteins 
to DNA containing adjacent HMG sites. L-Sox5, Sox6 
and Sox9 cooperatively activated expression of the 
chondrocyte differentiation marker Col2a1 in 10T1/2
and MC615 cells. A 48 bp chondrocyte-specific 
enhancer in this gene, which contains several HMG- 
like sites that are necessary for enhancer activity, 
bound the three Sox proteins and was cooperatively 
activated by the three Sox proteins in non-chondrogenic 
cells. Our data suggest that L-Sox5/Sox6 and Sox9, 
which belong to two different classes of Sox transcrip- 
tion factors, cooperate with each other in expression 
of Col2a1 and possibly other genes of the chondrocytic
program. 

Keywords: chondrogenesis/collagen2/Sox5/Sox6/Sox9 



Introduction 

Sox (Sry-type HMG box) proteins, which form a subfamily
of DNA-binding proteins with a high-mobility-group
(HMG) domain, have critical functions in a number of
developmental processes, including sex determination,
neurogenesis and skeleton formation (Laudet et al., 1993;
Pevny and Lovell-Badge, 1997; Southard-Smith et al.,
1998). Individual members of the Sox family show >50%
identity in their HMG domain to Sry, the testis-determining
factor (Wright et al., 1993). An essential role for SOX9
in skeleton formation was demonstrated with the identification
of mutations in SOX9 in patients with campomelic
dysplasia (Foster et al., 1994; Wagner et al., 1994;
Kwok et al., 1995; Meyer et al., 1997). This disease is
characterized by severe malformations of essentially all
cartilage-derived structures and is also often associated
with XY sex reversal (Houston et al., 1983; Mansour
et al., 1995).

Cartilage formation is a complex and essential process
in vertebrates. Cartilages are obligatory templates for the
formation of endochondral bones during development 
and also constitute permanent skeletal structures in the 
respiratory tract, in articular joints and other organs. 
Chondrocytes differentiate following condensation of 
mesenchymal cells in different locations of the embryo, 
including the frontonasal mass, branchial arches, sclero- 
tomes and limb buds. Typically, chondrocytes express a set 
of genes encoding cartilage-specific extracellular matrix 
components such as collagen II (encoded by the Col2a1
gene), collagens IX and XI, and aggrecan. In growth 
plates, chondrocytes undergo further differentiation and 
hypertrophy, producing a matrix in which type X collagen 
is abundant and calcification occurs. Apoptosis follows 
and cartilage is replaced by bone. Our understanding of 
chondrogenesis at the molecular level is still limited. 
Several cytokines, including bone morphogenetic proteins, 
Indian Hedgehog, parathyroid hormone-related peptide 
and fibroblast growth factors, are involved in either skeletal 
patterning or discrete steps of the chondrogenic pathway, 
and several transcription factors, such as Hox and Pax 
family members, are involved in patterning of skeletal 
primordia (Cancedda et al., 1995; Erlebacher et al., 1995;
Hall and Miyake, 1995). However, less is known about 
the transcription factors that control the determinative 
switch for chondrocyte differentiation and the activation 
of marker genes at each step of the chondrogenic cascade. 

Our laboratory has used genes for specific cartilage 
matrix components to identify transcription factors that 
control gene expression in chondrocytes. We have shown 
that a multimerized 48 bp sequence in the first intron 
of Col2a1 is sufficient to confer chondrocyte-specific
expression both in transgenic mice and in transient transfection
of cultured cells (Lefebvre et al., 1996). SOX9
binds to the enhancer and activates Col2a1 constructs in
transient transfections of non-chondrocytic cells (Lefebvre
et al., 1997) and in transgenic mice (Bell et al., 1997).
SOX9 also activates the Col2a1 gene when ectopically
expressed in some non-cartilaginous sites in transgenic
mice (Bell et al., 1997). Moreover, Sox9 is expressed along
with Col2a1 during chondrogenesis in mouse embryos
(Wright et al., 1995; Ng et al., 1997; Zhao et al., 1997).
Therefore, direct activation of COL2A1 is believed to be
an important function of SOX9 in chondrogenesis
(Lefebvre and de Crombrugghe, 1998). 

Several lines of evidence suggest that other transcription 
factors in addition to SOX9 may be needed to specify the
high-level expression of COL2A1 in chondrocytes. Sox9
is expressed in cells that do not express Col2a1, such as
those in genital ridges and specific areas of the embryonic
heart (Ng et al., 1997; Zhao et al., 1997). Sox9 is highly
expressed in the Sertoli cells of the testis and is involved
in male gonad differentiation (Kent et al., 1996; Morais
da Silva et al., 1996). The target genes of Sox9 in these
cells and in chondrocytes are not clearly defined, but the
phenotypes of these two cell types are so different that it 
is likely that Sox9 contributes to their differentiation 
by controlling expression of different genes. Different 
functions of Sox9 in these cells must be specified by 
differential expression of other factors. Ectopic expression 
of SOX9 in transfected cells, in which Col2a1 was silent,
did not result in Col2a1 activation (V.Lefebvre and B.de
Crombrugghe, unpublished data), and ectopic expression
of SOX9 in transgenic mice led to activation of Col2a1
only in a subset of tissues within the domain of ectopic
expression of SOX9 (Bell et al., 1997). We therefore
hypothesized that other factors either activate or derepress
SOX9 or cooperate with SOX9 in COL2A1-expressing
cells. 

We recently reported that the 48 bp enhancer of Col2a1
formed a large and abundant complex with nuclear proteins
from chondrocytes but not from other cells (Zhou et al.,
1998). These proteins were designated CSEPs, for 
chondrocyte-specific enhancer-binding proteins. They 
included Sox9 and unidentified protein(s). CSEPs appeared 
to contact the 48 bp DNA at several sites homologous 
to a consensus for HMG-domain proteins. Mutagenesis 
demonstrated a good correlation between binding of 
CSEPs to DNA and enhancer activity in chondrocytes, 
both in transient transfection experiments and in transgenic 
mice. In addition, we showed that two chondrocyte-specific
enhancer elements located in the Col11a2 promoter
contained HMG-like sites that were essential for enhancer
activity and formation of an enhancer-CSEP-like complex
(Bridgewater et al., 1998). These results suggested that
other HMG-domain proteins cooperate with Sox9 to
generate Col2a1 and Col11a2 enhancer activity and presumably
gene expression in chondrocytes.

We show here that in addition to Sox9, CSEPs are 
composed of a new long form of Sox5 (L-Sox5), and of 
Sox6, which both are members of a Sox subclass different 
from that of Sox9. L-Sox5 and Sox6 harbor a coiled-coil 
domain that mediates protein dimerization and efficient 
binding to adjacent HMG DNA sites. The three Sox 
genes are coexpressed in chondrogenesis and cooperate 
in Col2a1 activation. Our data strongly suggest that
L-Sox5, Sox6 and Sox9 together contribute to control
Col2a1, and perhaps other important genes of the chondrocyte
phenotype.



Results 

A long form of Sox5 (L-Sox5), Sox6 and Sox9 form
complexes with the 48-bp Col2a1 enhancer

The CSEP proteins that form a chondrocyte-specific complex
with the 48 bp Col2a1 enhancer were previously
shown to include a protein or proteins with an apparent
Mr of 75-95 kDa (Zhou et al., 1998). These proteins
exhibited DNA-binding properties of HMG-domain proteins,
including binding to several HMG-like sites in the
Col2a1 48 bp enhancer, binding to a probe containing a
consensus binding site for HMG-domain proteins (1HMG
probe), binding to the minor groove of DNA and binding
to DNA in the presence of poly(dG-dC) but not poly(dI-dC)
(Zhou et al., 1998). We also obtained evidence that
CSEP bound with high affinity to a tandem dimer of the
1HMG probe (2HMG probe) both in EMSA and in
Southwestern blots (data not shown).

On the basis of these results, the 2HMG probe was 
chosen to clone cDNAs for CSEPs by the Southwestern 
screening approach. cDNA expression libraries were made 
from primary chondrocytes of newborn mouse ribs. Several 
clones that showed stronger binding to the 2HMG probe 
in the presence of poly(dG-dC) than poly(dI-dC) encoded
sequences of Sox5 or Sox6. Interestingly, whereas the 
previously reported transcript for Sox5 had a length of 
2 kb and encoded a 43 kDa protein (Denny et al., 1992),
the Sox5 cDNA that was reconstituted from overlapping 
clones was 3.9 kb long and encoded a 75 kDa protein 
corresponding to Sox5 with an additional N-terminal 
sequence (data submitted to DDBJ/EMBL/GenBank; see 
later, in Figure 4). This long form of Sox5 was designated 
L-Sox5. Sox6 cDNA clones encoded Sox6 isoforms ident- 
ical or very similar to those described for testis (data 
submitted to DDBJ/EMBL/GenBank; see later, in 
Figure 4). 

Antibodies were generated against the C-termini of 
Sox5, Sox6 and Sox9, and affinity purified. In Western 
blotting, each antibody species specifically recognized its 
Sox protein target in extracts of fibroblasts transfected 
with Sox expression plasmids (Figure 1A). In EMSA, each
antibody preparation specifically supershifted complexes
formed between DNA and its target (Figure 1B). In
Western blots of RCS cell extracts, the antibodies strongly 
reacted with proteins at the level of L-Sox5, Sox6 and 
Sox9 (Figure 1C), indicating that each of these Sox 
proteins was indeed made by these cells. The same 
reactions were seen with extracts of primary chondrocytes 
and chondrocytic MC615 cells (data not shown), but 
not with extracts of BALB/3T3 (Figure 1C) or 10T1/2 
fibroblasts (see control lanes in Figure 1B). Note that Sox5
antibodies showed no reaction in chondrocyte samples at 
the level of the 43-kDa short form of Sox5. In extracts of 
adult mouse testis, Sox5 antibodies recognized Sox5 but 
not L-Sox5 (Denny et al., 1992). It appeared, therefore,
that only L-Sox5 is made in chondrocytes, not Sox5, 
whereas only Sox5 is made in the testis. These protein 
data are consistent with RNA data, which show that 
chondrocytes express only the transcript for L-Sox5 and 
testis only the transcript for short Sox5 (see Figure 2A). 
In EMSA, L-Sox5 and Sox6 made in transfected fibroblasts 
formed complexes with the 48 bp Col2a1 enhancer probe
that migrated at the level of the CSEP-DNA complex
(Figure 1D). Sox5 did not bind to the 48 bp element. As
shown previously (Zhou et al., 1998), SOX9 formed two
complexes with the enhancer, a major one migrating faster
than the CSEP-DNA complex and a minor one migrating
at the level of the CSEP-DNA complex (Figure 1D).
Sox5, Sox6 and Sox9 antibodies were each able to partially
supershift the CSEP-48-bp-Col2a1 enhancer complex
formed with RCS cell nuclear extracts (Figure 1E), indicating
that each of these Sox proteins contributed to complex
formation. A virtually complete supershift of the CSEP- 
DNA complex was obtained by incubating EMSA reac- 
tions with both Sox5 and Sox6 antibodies. No further 
supershift was visible when the three Sox protein anti- 
bodies were included in the reactions. The same results 
were obtained with primary chondrocyte extracts (data
not shown). L-Sox5 and Sox6 therefore appeared to be
predominant CSEP components.

Fig. 1. L-Sox5, Sox6 and Sox9 are present in chondrocytes and form the CSEP-Col2a1-48-bp enhancer complex. (A) Antibodies (AB) against Sox5,
Sox6 and SOX9 specifically recognized their target in Western blots. Samples were extracts of 10T1/2 fibroblasts transfected with expression
plasmids encoding no protein (-), L-Sox5, Sox6A or SOX9. The Mr of protein standards (×10^3) is indicated. Reaction with antibodies is seen at the
level expected for each Sox protein. (B) Antibodies against Sox5, Sox6 and Sox9 specifically supershifted the complex of their target with DNA.
L-Sox5 and Sox6 protein samples were cell extracts from 10T1/2 fibroblasts transfected with Sox expression plasmids. The SOX9 sample was the
product of in vitro transcription/translation with SOX9 plasmid. Samples were used in EMSA with the 2HMG probe. Two microliters of SOX crude
antiserums were included, as indicated. A preimmune serum was used as a control (first lane for each protein). None of the complexes seen in the
figure was seen in samples containing no SOX protein (data not shown). SOX9 synthesized in vitro formed two distinct complexes with the 2HMG
probe. The lower and upper complexes probably involved one and two molecules of SOX9 per molecule of DNA, respectively. (C) Antibodies
against Sox5, Sox6 and Sox9 identified their target in Western blots of nuclear extracts of RCS cells but not BALB/3T3 fibroblasts. An intense
reaction with antibodies was seen at the level of each Sox protein. (D) L-Sox5, Sox6 and SOX9 bound to the 48 bp probe. 10T1/2 fibroblasts were
transfected with expression plasmids for no protein (-), L-Sox5, Sox6, SOX9 or Sox5. Extracts of these cells and nuclear extracts of RCS cells were
used in EMSA with the 48 bp Col2a1 probe. L-Sox5 and Sox6 formed a complex with the probe that ran with a mobility similar to that of the
CSEP-DNA complex. As shown previously (Zhou et al., 1998), SOX9 formed two complexes with the Col2a1 enhancer probe, a minor one that
migrated at the level of the CSEP-DNA complex (arrow), and a major one that migrated faster (arrowhead). (E) Antibodies against Sox5, Sox6 and
Sox9 supershifted the CSEP-48-bp-Col2a1 complex. Two or four microliters of SOX antiserums (AB) were included in EMSA of RCS nuclear
extracts with the 48 bp Col2a1 probe, as indicated. Sox5 preimmune serum was used in the control reaction (first lane). None of the preimmune
serums supershifted any protein-DNA complex (data not shown). As described previously (Zhou et al., 1998), Sox9 present in RCS extracts did not
form a fast migrating complex with DNA (as seen in D). (F) EMSA with wild-type and mutant 48 bp Col2a1 probes. The upper strand of the wild-type
48 bp element (wt) is shown from 5' to 3'. Sites 1-4 are 7 bp HMG-like binding sites. Nucleotides corresponding to those of the heptamer
consensus HMG site C[A/T]TTG[A/T][A/T] are underlined. Mutated sites are spelt out, whereas wild-type nucleotides are indicated by dots.
Sites 1-3 and mutants mA1, mA3, mA4, mA6 and mA8 were described previously (Zhou et al., 1998). Note that site 2 in mA3 harbors the same
five consensus nucleotides as wild-type site 2, whereas, in other mutations, sites 1-4 retained only three or four consensus nucleotides. Binding of
CSEP to wild-type and mutant 48 bp probes was tested using RCS nuclear extracts. Binding of L-Sox5, Sox6 and SOX9 was tested using extracts of
10T1/2 cells transfected with either one of the Sox protein expression plasmids. All probes had the same radioactivity. Arrow, migration level of
CSEP. Arrowhead, fast-migrating complex of SOX9 with DNA.

Fig. 2. Long transcripts of Sox6 and Sox5 are expressed at high levels in chondrocytes. (A) Northern blot with total RNA. Samples were as follows:
Pr. Ch., primary chondrocytes; 2d p. Ch., chondrocytes at the second passage in culture; RCS cells; newborn mouse brain and adult mouse testis.
Hybridization was performed with probes that recognized both the long and short transcripts of either Sox5 or Sox6. Staining of 28S and 18S rRNA
is shown as the reference for RNA loading. The sizes of Sox5 and Sox6 transcripts were calculated by comparison with the migration of RNA
standards. (B) Northern blot with total RNA. Samples were as follows: Pr. Ch., primary chondrocytes; Pr. sk. fibr., primary skin fibroblasts from
newborn mice; RCS cells; MC615 mouse immortalized chondrocyte cells at an early passage; 10T1/2 and BALB/3T3 mouse embryo fibroblasts;
ROS rat osteosarcoma cells; C2C12 mouse myoblastic cells; EL4 mouse lymphoma cells; HeLa human carcinoma cells; COS monkey kidney cells;
and MCTs, mouse immortalized chondrocytes. Hybridization was performed with probes for the long and short transcripts of either Sox5 or Sox6.
Only signals at the level of the long transcripts are shown. No significant hybridization was seen at the level of the short transcripts in any sample
(data not shown). Hybridization with an 18S rRNA probe is shown as the control for RNA loading. RNA samples were the same as in Lefebvre
et al. (1997). (C) Same Northern blot as in (B), but with total RNA from newborn mouse tissues.

Altogether these results indicated that three different 
Sox proteins, L-Sox5, Sox6 and Sox9, were present in 
chondrocytes and bound to the 48 bp Col2a1 element.
L-Sox5 is a long product of the Sox5 gene that has not 
been identified previously. 

L-Sox5, Sox6 and Sox9 contact several HMG-like
sites in the 48 bp Col2a1 enhancer

The Col2a1 48 bp enhancer contains four HMG-like sites,
each containing 5 or 6 bp of the 7 bp consensus HMG
site (Figure 1F). Previously we showed that mutations
that disrupted any one of sites 1-3 inhibited enhancer
activity in chondrocytes in transgenic mice and partially
inhibited binding of CSEP to the enhancer (mutants mA1,
mA4 and mA6 in Figure 1F; Zhou et al., 1998). A
mutation that disrupted site 4 also impaired binding of
CSEP to the enhancer (mutant mA9 in Figure 1F). The
effect of this mutation on enhancer activity has not been
tested. A mutation that preserved all five HMG consensus
nucleotides of site 2 did not affect the binding of CSEP
(mA3 in Figure 1F) and only weakly inhibited enhancer
activity in RCS cells (Zhou et al., 1998). A mutation that
disrupted three sites completely inhibited binding of CSEP
(mA8 in Figure 1F; Zhou et al., 1998). We also reported
previously that subfragments of the 48 bp element containing
either one of the four HMG-like sites alone were
unable to bind CSEP or to compete with the 48 bp element
for binding of CSEP (Lefebvre et al., 1996, 1997; Zhou
et al., 1998; our data, not shown). These results indicated
that the four sites of the 48 bp element cooperatively
contributed to formation of the CSEP-enhancer complex.

As described previously (Lefebvre et al., 1997; Zhou
et al., 1998), the faster-migrating SOX9-DNA complex
was inhibited by mutation of site 3 but not by mutation
of other sites (Figure 1F). In contrast, the slower-migrating
SOX9-DNA complex was partially inhibited by mutations 
disrupting any of the four sites and was completely 
inhibited by mutation of several sites. These results are 
consistent with the notion that the faster-migrating com- 
plex was formed by binding of one molecule of SOX9 to 
site 3, whereas the slower-migrating complex was formed 
by cooperative binding of two or more SOX9 molecules 
to several sites on each DNA molecule. 

Consistent with the results obtained with CSEP, binding 
of L-Sox5 or Sox6 to the 48 bp enhancer was partially 
inhibited by mutations that disrupted any one of the four 
HMG-like sites and completely inhibited by a mutation 
of several sites (Figure 1F). Mutation mA1 slightly but
reproducibly inhibited binding of L-Sox5 and Sox6 to
DNA. As seen with CSEP, L-Sox5 and Sox6 were unable
to bind to DNA probes harboring only one of the Col2a1
HMG-like sites (data not shown). Taken together, these
data indicated that L-Sox5 and Sox6 were able to contact
cooperatively all four HMG-like sites of the Col2a1 48 bp
enhancer. 

In chondrocyte nuclear extracts, the faster-migrating 
complex of Sox9 with the 48 bp enhancer was not seen 
(Zhou et al., 1998; compare also extracts of RCS cells
and extracts of SOX9-transfected 10T1/2 cells in Figure
1D), but antibody supershift experiments indicated that
Sox9 was present in the slower migrating CSEP complex. 
This result strongly suggested that, in chondrocytes, Sox9 
bound the 48 bp element not as a single molecule but in 
cooperativity with other Sox9, L-Sox5 or Sox6 molecules. 

These data suggest a model whereby the four HMG- 
like sites of the 48 bp element and the three Sox proteins 
could participate in vivo in the formation of a large 
protein-enhancer complex that would activate expression 
of Col2a1 in chondrocytes. Consistent with this model in
which mutation of any of the HMG-like sites would
dismantle the protein-enhancer complex, we have
observed that disruption of any of sites 1, 2 or 3 resulted
in abolition of the activity of the enhancer in chondrocytes
of transgenic mice (Zhou et al., 1998).

A search for other HMG consensus and HMG-like sites
(with six nucleotides of the heptamer consensus) in 1 kb
of Col2a1 promoter sequence and in a 468 bp element of
the first intron (+1878/+2345), which acts as a strong
chondrocyte-specific enhancer in transgenic mice (Zhou
et al., 1995), revealed the presence of a few scattered
sites, but no cluster of two or more binding sites for Sox9,
L-Sox5 and Sox6 was found that resembled that of the
48 bp element (data not shown).
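
The site search described above can be sketched as a scan of both DNA strands for heptamers that match a consensus at six or more of seven positions. This is only an illustration under stated assumptions: the consensus string WWCAAWG (W = A or T), the function names and the toy input are not taken from this section, and the tool the authors actually used is not described here.

```python
# Illustrative sketch only. ASSUMPTION: the HMG heptamer consensus is
# written here as WWCAAWG (W = A or T); this section does not spell it out.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_matches(site, consensus):
    """Count positions of `site` that are allowed by the consensus code."""
    return sum(base in IUPAC[code] for base, code in zip(site, consensus))


def find_hmg_like_sites(seq, consensus="WWCAAWG", min_matches=6):
    """Scan both strands; return (start, strand, heptamer, matches) tuples.

    `start` is the 0-based position of the window on the given (top) strand.
    min_matches=6 keeps consensus sites (7/7) and HMG-like sites (6/7).
    """
    seq = seq.upper()
    k = len(consensus)
    hits = []
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            matches = count_matches(site, consensus)
            if matches >= min_matches:
                hits.append((start, strand, site, matches))
    return hits


if __name__ == "__main__":
    # Toy sequence (hypothetical, not a Col2a1 sequence).
    for hit in find_hmg_like_sites("GGGAACAATGGGG"):
        print(hit)
```

A cluster search like the one in the text would then look for two or more hits within a short window; that step is omitted because the spacing criterion is not given here.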

Long transcripts of the Sox6 and Sox5 genes are 
expressed in chondrocytes 

Sox5 and Sox6 were previously shown to be highly
expressed in the testis of adult mice (Denny et al.,
1992; Connor et al., 1995; Takamatsu et al., 1995). The
transcripts were ~2.0 and ~3.2 kb long, respectively.
Traces of longer transcripts (~10 kb) were described for
Sox6 in immature mouse testis (Takamatsu et al., 1995)
and several tissues of adult mice (Connor et al., 1995),
and also for SOX5 in human fetal brain (Wunderle
et al., 1996).

In Northern blots of total RNA, the short transcripts of
Sox6 (3.2 kb) and Sox5 (2 kb) were abundant in adult mouse
testis, as expected (Figure 2A). These short transcripts were
not expressed (or were expressed at a very low level in
the case of Sox6) in chondrocytes or in any cell line or
newborn mouse tissue examined (Figure 2A; data not
shown). A 6.3 kb Sox5 transcript and a 7.7 kb Sox6
transcript were found in similar relative abundance in
primary chondrocytes from ribs of newborn mice, as well
as in RCS and early-passage MC615 cells (Figure 2A and
B). These three chondrocyte cell types were previously shown
to be well differentiated, expressing Col2a1 at a high
level in parallel with Sox9 (Lefebvre et al., 1997). When
rib chondrocytes were allowed to dedifferentiate by two
passages in monolayer culture, Sox5 and Sox6 expression
sharply declined (Figure 2A), as did Col2a1 and Sox9
expression (Lefebvre et al., 1997). Transcripts of Sox5,
Sox6 and Sox9 were found in some non-chondrocytic cell
types, but in contrast to chondrocytes, none of the non-
chondrocytic cells coexpressed the three Sox genes or
expressed Col2a1 at high levels (Figure 2B; Lefebvre
et al., 1997).

The long transcripts of Sox5 and Sox6 were present in
the brain and some other non-cartilaginous tissues of
newborn mice (Figure 2A and C). However, no non-
cartilaginous tissue, besides brain and testis, was found to
coexpress Sox9, Sox5 and Sox6 (Figure 2A and C;
Lefebvre et al., 1997). In testis, Sox9 expression is
restricted to the somatic Sertoli cells (Kent et al., 1996;
Morais da Silva et al., 1996); Sox5 is expressed as a short
transcript in post-meiotic germ cells, mostly in round
spermatids (Denny et al., 1992); and, since expression of
Sox6 correlates with that of the protamine gene (Takamatsu
et al., 1995), a marker of spermatid differentiation, it is
possible that Sox6 is expressed in testis in the same cells
and at the same time as Sox5. The three Sox proteins,
therefore, do not appear to be expressed in the same cells
in testis.

In conclusion, chondrocytes, and perhaps some brain
cells, appear to express together long transcripts for Sox5
and Sox6 and transcripts for Sox9. Expression of these
transcripts correlates with expression of Col2a1 in
chondrocytes.

The long transcripts of Sox5 and Sox6 are 
coexpressed with Sox9 and Col2a1 during
chondrogenesis in mouse embryos 

Previous whole-mount in situ experiments showed that
Sox6 was expressed in the developing nervous system
of mouse embryos; expression was high in early-stage
embryos, but disappeared by embryonic day 12.5 (Connor
et al., 1995). To obtain information on Sox6 and Sox5
expression during chondrogenesis in mouse embryos, we
performed a series of in situ experiments and compared
the expression patterns of these two Sox genes with those
of Sox9 and Col2a1. For Sox5, we chose a probe that
recognized the long transcript of Sox5 but not the short
transcript, since we were interested in sites of expression
of the L-Sox5 protein only. For Sox6, two probes were
tested that recognized either the long transcript only or
both the long and short transcripts, since both transcripts
appear to encode the same protein (Connor et al., 1995;
Takamatsu et al., 1995); the two probes gave identical
results (data not shown).

Fig. 3. Transcripts for L-Sox5, Sox6 and Sox9 are coexpressed with Col2a1 in chondrogenesis in mouse embryos. (A) In situ hybridization of
sections through 10.5-day-old mouse embryos. A dark-field picture of a section hybridized with Col2a1 probe was inverted to show the following:
ba1-3, first to third branchial arches; fb, forelimb bud; hb, hindlimb bud; hm, head mesenchyme; ne, neural epithelium; no, notochord; nt, neural
tube; ot, otic vesicle; and sc, sclerotome. All these sites expressed transcripts of type II collagen, L-Sox5, Sox6 and Sox9, as shown in dark-field
pictures. Expression of Col2a1 in hindlimb buds is not yet seen in 10.5-day-old embryos, in contrast to expression of the three Sox genes. (B) In situ
hybridization of sagittal sections through 12.5-day-old mouse embryos. Dark-field pictures of sections hybridized with Col2a1 and Sox6 probes were
inverted to show skeletal and non-skeletal structures; h.t.c, cartilage primordium of hyoid bone, thyroid and cricoid cartilages. Dark-field pictures
show expression of transcripts for type II collagen, L-Sox5, Sox6 and Sox9 in these areas. (C) In situ hybridization of longitudinal sections through
a forelimb of a 15.5-day-old mouse embryo. Adjacent sections were hybridized with collagen and Sox RNA probes. Col10a1, gene for α1(X)
collagen; Col1a1, gene for pro-α1(I) collagen. Arrows point to the diaphysis of cartilages in the metacarpus, where chondrocytes become
hypertrophic before giving way to osteoblasts. In these areas, chondrocytes are downregulating expression of the Sox genes and activating expression
of Col10a1. A bracket underlines the diaphysis of the ulna, where ossification is more advanced than in the metacarpals: at each extremity,
hypertrophic chondrocytes have turned off expression of the Sox genes and are switching from Col2a1 to Col10a1 expression; in the center,
osteoblasts are expressing Col1a1. (D) In situ hybridization of longitudinal sections through a hindlimb of a 17.5-day-old mouse embryo. Adjacent
sections were hybridized with collagen and Sox RNA probes. Brackets show different zones of cartilage and bone in the tibia. Chondrocytes in
hyaline cartilage and proliferating chondrocytes in growth plates actively express Col2a1 and transcripts for L-Sox5, Sox6 and Sox9; at a later stage
of differentiation, hypertrophic chondrocytes no longer express the Sox genes, and Col2a1 expression is being progressively replaced by Col10a1
expression. Osteoblasts and cells in surrounding tissues are expressing Col1a1.

At about mid-stage embryogenesis (day 10.5 post-coitum),
mesenchymal cells form prechondrocytic condensations
at different sites in the embryo. The sclerotomal
components of somites contain precursor cells for the
axial skeleton, the head mesenchyme and first and second
branchial arches will generate part of the craniofacial
skeleton, and the lateral plate mesoderm and limb buds
will give rise to the appendicular skeleton. Transcripts for
L-Sox5 and Sox6 were found in all these sites together
with Col2a1 and Sox9 RNAs (Figure 3A; Cheah et al.,
1991; Wright et al., 1995; Ng et al., 1997; Zhao et al.,
1997). The four RNAs also co-localized in some non-
chondrogenic sites such as brain, neural tube, otic vesicles
and notochord.

At day 12.5 of embryonic development, cartilage is 
actively forming in all future cartilaginous and endo- 
chondral skeletal structures, including cartilages of the 
nose and ear, the thyroid, cricoid and hyoid cartilages, 
Meckel's cartilage, cartilages of the base of the skull, ribs, 
vertebrae, forelimbs and hindlimbs (Figure 3B). High 
levels of transcripts for L-Sox5 and Sox6 were found in 
all these structures, together with Col2a1 and Sox9 RNAs.
Expression of the four genes was also seen in some areas 
of the brain and spinal cord. Transcripts for Sox6 and 
L-Sox5, but not Sox9, were also visible in liver and a few 
other non-cartilaginous areas. 

In the limbs of 15.5- and 17.5-day-old embryos (Figure
3C and D), expression of Col2a1 and of the three Sox
genes was high in the cartilaginous templates of the
radius and ulna, carpals, metacarpals and tibia, and in
proliferating chondrocytes of growth plates. When
chondrocytes became hypertrophic in growth plates, they
activated expression of Col10a1, whereas RNAs for all
three Sox proteins disappeared rapidly and simultaneously
and RNAs for Col2a1 disappeared more slowly. Only
traces of Col2a1 and Sox RNAs were found after cartilage
was replaced by bone and expression of Col1a1 was
activated in osteoblasts.

In conclusion, RNAs for collagen II, L-Sox5, Sox6 and
Sox9 were expressed simultaneously and at high levels
from early stages of chondrogenesis in all cartilaginous
sites in mouse embryos. Expression of the three Sox genes
appeared to be inhibited just before Col2a1 expression in
hypertrophic chondrocytes. These results are consistent
with a role for all three Sox proteins in the activation of
Col2a1 in chondrogenesis, and possibly also in the activ-
ation of other genes of the chondrogenic program.

Comparison of Sox5, L-Sox5 and Sox6 cDNAs and
polypeptides 

Connor et al. (1995) and Takamatsu et al. (1995) pre-
viously isolated Sox6 cDNA species from adult mouse
testis cDNA libraries. The cDNAs reported by the two
groups encoded slightly different proteins, called Sox6
and Sox-LZ, respectively. For convenience, we have
renamed them Sox6A and Sox6B. The proteins differ in
two segments, most likely encoded by alternatively spliced
exons (Figure 4A). Segment 1 encodes a MAAAA amino
acid sequence (S1A) in Sox6A and a SSAAA sequence
(S1B) in Sox6B, with different A codons being used in
S1A and S1B. Segment 2 encodes a 41-amino-acid
sequence and is present only in Sox6A. When total cDNA
from primary chondrocytes was used in the polymerase
chain reaction (PCR) with primers flanking the region of
the S1 and S2 segments, the amplified products had the
size expected for Sox6A (478 bp) and Sox6B (355 bp)
products (data not shown). Sequencing of the PCR
products indicated that the longer product contained S1A
and S2, and was identical to Sox6A. The shorter product
contained S1A and lacked S2. It therefore corresponded
to a new form of Sox6, which we have named Sox6C
(Figure 4A). In DNA-binding and transactivation assays,
the three forms of Sox6 displayed identical activities (data
not shown). These PCR data and additional sequencing
of Sox6 cDNA clones from chondrocyte cDNA libraries
suggested that the short and long transcripts of Sox6 both
encoded protein isoforms with minor differences. Sox6
cDNA clones obtained from chondrocyte libraries con-
tained an additional 3' untranslated sequence at the 3'
end of the previously reported cDNAs from testis (data
submitted to DDBJ/EMBL/GenBank). The longest
sequence was 1.5 kb and accounts in part for the difference
between the long and short transcripts of Sox6.
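
The PCR product sizes quoted above are consistent with the S2 segment alone accounting for the size difference. A short check, written as a sketch: the variable and function names are illustrative, and the only inputs are the numbers given in the text (41 residues for S2, 478 bp and 355 bp for the two products).

```python
# Sketch: does the 41-residue S2 segment explain the PCR size difference?
CODON_BP = 3
S2_RESIDUES = 41            # S2 is present only in Sox6A
PRODUCT_WITHOUT_S2_BP = 355  # size expected for Sox6B (no S2)

s2_bp = S2_RESIDUES * CODON_BP  # 123 bp


def expected_product_bp(has_s2):
    """Expected amplicon size. S1A and S1B both encode five residues,
    so the S1 variant does not change the product length."""
    return PRODUCT_WITHOUT_S2_BP + (s2_bp if has_s2 else 0)


if __name__ == "__main__":
    isoforms = {"Sox6A": True, "Sox6B": False, "Sox6C": False}
    for name, has_s2 in isoforms.items():
        print(name, expected_product_bp(has_s2), "bp")
```

Because Sox6B and Sox6C give products of the same length, size alone cannot tell them apart, which is why sequencing of the shorter product was needed to identify it as Sox6C.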



Fig. 4. Comparison of Sox5, L-Sox5 and Sox6 cDNAs and polypeptides. (A) Schematic comparison of Sox6A, Sox6B and Sox6C polypeptides.
Three forms of Sox6 differed in segments S1 and S2. Sox6A and Sox6B are schematized according to the sequences reported by Connor et al.
(1995) and Takamatsu et al. (1995), respectively. Sox6C, whose cDNA sequencing has not been completed, is schematized based on the assumption
that it is identical to Sox6A and Sox6B outside of segments S1 and S2. Two coiled-coil domains (1st cc and 2d cc) and the HMG DNA-binding
domain, are indicated. S1A and S1B segments are compared at the nucleotide level and, in parentheses, at the amino acid level. (B) Comparison of
Sox5 and L-Sox5 cDNAs. Sequences available for Sox5 and L-Sox5 cDNAs are presented as blocks. Shaded areas indicate coding sequences, 
between the first in-frame ATG codon and the next stop codon. Sox5 and L-Sox5 cDNAs are identical in the segment delineated by the two dotted 
lines, but they differ totally on either sides of this segment. Numbers refer to nucleotide positions relative to the 5' end. The Sox5 cDNA 
representation follows the sequence published by Denny et al. (1992). L-Sox5 cDNA is schematized according to a 3881 bp sequence obtained from 
overlapping clones. Both cDNAs must lack 5' and/or 3' untranslated sequences, as RNA transcript lengths were estimated to be ~2 kb for Sox5 and 
~6.3 kb for L-Sox5. (C) Alignment of L-Sox5 and Sox6A amino acid sequences. The sequence of L-Sox5 (upper sequence), derived from its cDNA 
sequence, is compared with that of Sox6A (lower sequence) (Connor et al., 1995). Numbers refer to amino acid positions in the proteins. Sequence 
alignment was generated using the GAP program from the Genetics Computer Group (GCG, Madison, WI). Dots were introduced within sequences 
to maximize alignment. Vertical lines denote amino acid identities. Double and single dots between sequences indicate amino acid changes with high 
and low degrees of conservation, respectively. The methionine translation initiation codon of the short form of Sox5 is circled in the L-Sox5 
sequence (residue 288). Boxes outline two potential coiled-coil domains and the HMG DNA-binding domain. The sequence of Sox6A spanning 
residues 330-379 was omitted because it has no counterpart in L-Sox5. (D) Schematic comparison of Sox5, L-Sox5 and Sox6A polypeptides. Each 
protein is represented as a block between its N- and C-termini. HMG, HMG DNA-binding domain; cc, potential coiled-coil domain. Hatched 
boxes represent regions of >10 residues in Sox6A that do not exist in L-Sox5. Dotted lines link regions of similarity between the proteins. 
Numbers refer to the position of the residues marking domain limits. The predicted molecular weight of each protein is given in parentheses. 
(E) Delineation of potential coiled-coil domains in L-Sox5 and Sox6A. The amino acid sequences of L-Sox5 and Sox6A were analyzed with the 
Coilscan program from GCG using an unweighted matrix at a window size of 28 residues. The probability of each residue participating in coiled- 
coil formation is plotted against its position in the sequences. A score of 1.0 indicates a maximum probability. The first potential coiled coil involves 
residues 158-240 in L-Sox5 and residues 181-262 in Sox6A; the second potential coiled coil spans residues 364-403 in L-Sox5 and 488-515 in
Sox6A. 






The L-Sox5 cDNA that was reconstituted from over-
lapping clones contained a 2037 bp open reading frame
that encodes a 679-amino-acid protein, with 287 amino
acids N-terminal to, and in frame with, the 392-amino-
acid Sox5 protein sequence (Denny et al., 1992) (Figure
4B-D). It contained 189 bp of 5' untranslated sequence,
with an in-frame stop codon 120 bp upstream of the first
ATG (data not shown). The first three in-frame methionine
codons are surrounded by sequences homologous to the
Kozak consensus and are therefore putative translation
initiation codons (data not shown). Sox5 (Denny et al.,
1992) and L-Sox5 cDNAs share an uninterrupted 1571 bp
sequence that includes the 1179 bp coding sequence of
Sox5 (Figure 4B). Upstream and downstream of this
region, the sequences available for the two cDNAs are
totally different. Sox5 and L-Sox5 mature RNAs must
therefore arise from differential splicing of primary tran-
scripts in the 5' and 3' regions, and different promoter
usage for the two RNAs cannot be excluded.
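
The lengths quoted for the L-Sox5 and Sox5 coding regions can be cross-checked with simple codon arithmetic. The sketch below uses only the numbers in the text; the names are illustrative, and the comments about whether a stop codon is counted are an interpretation of the numbers, not something the text states.

```python
# Sketch: codon arithmetic for the L-Sox5 and Sox5 coding regions.
CODON_BP = 3


def codon_count(length_bp):
    """Number of codons in a coding region; the length must be a multiple of 3."""
    if length_bp % CODON_BP:
        raise ValueError(f"{length_bp} bp is not a whole number of codons")
    return length_bp // CODON_BP


L_SOX5_ORF_BP = 2037          # open reading frame of the L-Sox5 cDNA
SOX5_CODING_BP = 1179         # coding sequence of the short Sox5 cDNA
N_TERMINAL_EXTENSION_AA = 287  # residues unique to L-Sox5
SOX5_AA = 392                 # length of the short Sox5 protein

l_sox5_aa = codon_count(L_SOX5_ORF_BP)      # 679 codons, stop codon not counted
sox5_codons = codon_count(SOX5_CODING_BP)   # 393 = 392 residues + 1 stop codon
sox5_start_in_l_sox5 = N_TERMINAL_EXTENSION_AA + 1  # residue 288 of L-Sox5

if __name__ == "__main__":
    print("L-Sox5 residues:", l_sox5_aa)
    print("287 + 392 =", N_TERMINAL_EXTENSION_AA + SOX5_AA)
    print("Sox5 codons including stop:", sox5_codons)
    print("Sox5 initiator Met within L-Sox5: residue", sox5_start_in_l_sox5)
```

The last value agrees with the Fig. 4C legend, which places the initiation methionine of the short form at residue 288 of L-Sox5.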

[Fig. 4A, segment S1 as labeled in the figure: S1A, ATG GCA GCT GCT GCT (MAAAA); S1B, .GT T.. ..A ..A ..A (SSAAA).]

Previous comparison of the short Sox5 protein with
Sox6A (Connor et al., 1995) indicated an overall high
degree of identity (61%), with a maximum of 93% in the
HMG DNA-binding domain (Figure 4C and D). By
comparison, the HMG domains of Sox5 and Sox6 are
only 54 and 53% identical to that of Sry, respectively, and
50% and 49% identical to that of Sox9. The N-terminal
segment of L-Sox5 also has a high degree of sequence
identity (72%) with the N-terminus of Sox6 (Figure 4C
and D). Computer analysis of the protein sequences
revealed two regions compatible with formation of coiled
coils in L-Sox5 and Sox6 (Figure 4E) (Lupas, 1996). The
more upstream coiled-coil domain is highly conserved
between L-Sox5 and Sox6 (91% identity) and has a
maximum probability score for coiled-coil formation of
1.0 in both proteins. It is particularly long, with 83 residues
in L-Sox5 and 82 in Sox6, which implies that the coiled
coil may complete almost 24 helical turns. It includes in
its N-terminal part the leucine zipper previously identified
in Sox6 (Connor et al., 1995; Takamatsu et al., 1995) and
a glutamine-rich domain in its C-terminal part, referred
to as a Q box (Kido et al., 1998) (Figure 4C). The more
downstream coiled-coil domain is located in the sequence
of L-Sox5 that is shared with Sox5. This domain is shorter,
with 40 residues in Sox5 and 29 in Sox6. It has a maximum
probability score of 1.0 in Sox5 but a low-probability
score of 0.39 in Sox6. It is only moderately conserved
between the two proteins (59% identity). No region of
homology with Sox9 was found outside the HMG domain.
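
The domain lengths and the "almost 24 helical turns" estimate can be reproduced from the residue ranges given in the Fig. 4E legend. A minimal sketch, assuming 3.5 residues per helical turn (the usual figure for a coiled coil, in which a seven-residue heptad spans two turns); that value and the names below are not taken from the text.

```python
# Sketch: coiled-coil length and helical turns from residue ranges.
# ASSUMPTION: 3.5 residues per turn (one heptad repeat = two turns).
RESIDUES_PER_TURN = 3.5

# First potential coiled coil, residue ranges from the Fig. 4E legend.
FIRST_COILED_COIL = {"L-Sox5": (158, 240), "Sox6A": (181, 262)}


def span_length(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


def helical_turns(n_residues, residues_per_turn=RESIDUES_PER_TURN):
    """Approximate number of helical turns for a helix of n_residues."""
    return n_residues / residues_per_turn


if __name__ == "__main__":
    for protein, (first, last) in FIRST_COILED_COIL.items():
        n = span_length(first, last)
        print(f"{protein}: {n} residues, about {helical_turns(n):.1f} turns")
```

With these assumptions the 83-residue domain of L-Sox5 gives about 23.7 turns, in line with the estimate in the text.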

In conclusion, the striking identity of L-Sox5 and Sox6 
in the first coiled-coil domain and in the HMG domain 
strongly suggests that each of these domains must serve 
one or several important functions. The presence of these 
domains in both proteins and the otherwise overall high 
degree of identity between the two proteins (67%) suggest 
that L-Sox5 and Sox6 may play similar roles in vivo.

Dimerization of L-Sox5 and Sox6 highly stabilizes
binding to adjacent HMG sites on DNA 

L-Sox5 and Sox6 bound to the 2HMG probe much more 
efficiently than to the 1HMG probe (Figure 5A). In 
contrast, SOX9 bound with similar efficiency to both 
probes. This observation, the presence of potential coiled 
coils in L-Sox5/Sox6 and the ability of Sox6 to homo- 
dimerize (Takamatsu et al., 1995) led to the hypothesis 
that dimerization of the proteins through their coiled 
coils might stabilize their binding to adjacent DNA- 
binding sites. 

To test this hypothesis, various deletion forms of L-Sox5
were synthesized in vitro and treated with glutaraldehyde
(Figure 5B), which has the ability to cross-link interacting
polypeptides (Wong, 1991). After separation by SDS-
PAGE (Figure 5C), polypeptides with the expected Mr
were seen for each full-length and deleted protein species
in samples treated with no glutaraldehyde. Upon treatment
with glutaraldehyde, monomeric forms of Sox6 and
L-Sox5 were less abundant, and polypeptide species
appeared that had the apparent Mr of homodimers.
Similarly, the 151/679 and 32/437 deletions formed cross-
linked species with an Mr consistent with dimerization. In
these deletions, the two coiled-coil domains were intact.
Deletion 213/679, short Sox5 and deletion 32/221-304/
437, all of which had the first coiled-coil domain partially
or totally deleted, did not form specific cross-linked
species, nor did SOX9. These results indicated that the
first coiled-coil domain of L-Sox5, like that of Sox6
(Takamatsu et al., 1995), was involved in specific protein-
protein interactions, most likely protein homodimerization.
The second coiled-coil domain of L-Sox5, which is present
in 213/679, short Sox5 and deletion 32/221-304/437,
appeared to be unable to mediate protein dimerization by
itself. It might, however, have a role in stabilizing protein
dimerization through the first coiled-coil domain.
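
Which coiled-coil domain survives in each deletion can be worked out from the construct names and the domain coordinates in the Fig. 4E legend. The sketch below is an illustration only: the function and dictionary names are hypothetical, and representing short Sox5 as residues 288-679 of L-Sox5 is an inference from the 287-residue N-terminal extension described earlier.

```python
# Sketch: classify each coiled-coil domain as intact, partial or absent
# in the L-Sox5 deletion constructs (residue numbering of L-Sox5).
DOMAINS = {"cc1": (158, 240), "cc2": (364, 403)}

# Each construct is a list of retained, non-overlapping residue segments.
CONSTRUCTS = {
    "L-Sox5": [(1, 679)],
    "151/679": [(151, 679)],
    "213/679": [(213, 679)],
    "32/437": [(32, 437)],
    "32/221-304/437": [(32, 221), (304, 437)],
    "Sox5": [(288, 679)],  # ASSUMPTION: short Sox5 = residues 288-679
}


def domain_status(segments, domain):
    """Return 'intact', 'partial' or 'absent' for one domain."""
    start, end = domain
    covered = sum(
        max(0, min(end, seg_end) - max(start, seg_start) + 1)
        for seg_start, seg_end in segments
    )
    if covered == end - start + 1:
        return "intact"
    return "partial" if covered else "absent"


if __name__ == "__main__":
    for name, segments in CONSTRUCTS.items():
        status = {d: domain_status(segments, rng) for d, rng in DOMAINS.items()}
        print(f"{name:>15}: {status}")
```

Under these coordinates only the constructs with an intact first coiled coil (full-length L-Sox5, 151/679 and 32/437) are the ones reported to form cross-linked species, while all constructs retain the second coiled coil.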

In EMSA (Figure 5D), deletion 151/679 bound DNA 
as efficiently as L-Sox5 (lanes 2 and 3), but deletion 213/ 
679 did not (lane 4), nor did Sox5 (lane 5). The first 
coiled-coil domain of L-Sox5 (residues 158-240) was still 
intact in 151/679 but largely deleted in 213/679. It appeared
therefore to be involved in the high affinity of L-Sox5 for 
the 2HMG probe. A protein truncated in the C-terminus 
(32/437) did not bind DNA (lane 6), an expected result 
given that the HMG domain was deleted. 

When L-Sox5 was mixed with deletion 151/679, the 
respective complexes of the two proteins formed, but a 
third abundant complex also formed whose migration 
level indicated that it was likely to involve one molecule 
of each of the two proteins (lane 9). A similar intermediate 
complex also formed with Sox6 and 151/679 (lane 15). 
This result strongly suggested that L-Sox5 binds the 
2HMG probe as a homodimer or heterodimer with Sox6. 
On the basis of its DNA-binding properties, a similar 
conclusion can be made for Sox6. 

When L-Sox5 was mixed with deletion 213/679 (lane 
10), only the complex of L-Sox5 homodimer with DNA 
formed, indicating that the coiled-coil domains of two 
molecules of L-Sox5 were needed for efficient binding to 
DNA. Also, when L-Sox5 was mixed with deletion 32/ 
437 (lane 12), no heterodimers bound to the 2HMG probe, 
indicating that the two C-termini of L-Sox5 molecules 
(which included the HMG domain) were required to 
bind DNA. 

Together, these results suggest a model (Figure 5E) in 
which binding of two molecules of L-Sox5 and Sox6 to 
adjacent HMG sites on DNA is highly stabilized by 
protein dimerization through the coiled-coil domains. 

Cooperative activation of chondrocyte-specific 
Col2a1 constructs by L-Sox5, Sox6 and SOX9 

The ability of L-Sox5, Sox6 and SOX9 to activate Col2a1
reporter constructs in non-chondrogenic cells was tested
using constructs in which Col2a1 enhancer segments were
placed in an intron downstream of the promoter, as is the
case in the endogenous Col2a1 gene.

A construct with a 309 bp Col2a1 promoter (p309;
Figure 6A) was previously shown to be inactive in
chondrocytes of transgenic mice (Zhou et al., 1995). In
transient transfection of 10T1/2 cells, this construct was
slightly but reproducibly activated by L-Sox5/Sox6 but
not by SOX9, and no cooperation occurred among the
three Sox proteins (Figure 6B). A similar construct that
included four tandem copies of the 48 bp enhancer
[p309-(4x48); Figure 6A] was specifically expressed in
chondrocytes both in transgenic mice (Lefebvre et al.,
1996) and in transient transfection (data not shown). In
non-chondrogenic cells, this construct was not activated
by L-Sox5, Sox6 or a combination of L-Sox5 and Sox6
at a higher level than the p309 construct, and it was
moderately activated by SOX9 (Figure 6C). Interestingly,
when SOX9 was co-expressed with L-Sox5, Sox6, or both
L-Sox5 and Sox6, a higher activation occurred than
with any Sox protein alone, demonstrating cooperativity
between SOX9 and L-Sox5/Sox6.






Fig. 5. Dimerization of L-Sox5 and Sox6 stabilizes binding to adjacent HMG sites. (A) L-Sox5 and Sox6 bind the 2HMG probe more efficiently
than the 1HMG probe. Extracts of 10T1/2 fibroblasts transfected with empty (-), SOX9, L-Sox5 or Sox6 expression plasmids were incubated with
the 1HMG probe or the 2HMG probe. The two probes had the same radioactivity. Arrow, L-Sox5-Sox6 complexes with DNA and slow-migrating
SOX9-DNA complex. Arrowhead, fast-migrating SOX9-DNA complex; this complex was often seen as a doublet. (B) Deletions in L-Sox5. L-Sox5
full-length protein is schematized as described in Figure 4D. Truncated proteins were named by the first and last residues of L-Sox5 that they
contain. In 32/221-304/437, the sequence 222/303 in 32/437 was deleted. (C) The first coiled-coil domain of L-Sox5 and Sox6 is involved in
protein-protein interactions. Sox6, L-Sox5, truncated L-Sox5 polypeptides and SOX9 were synthesized in vitro with [35S]methionine and then
incubated with (+) or without (-) glutaraldehyde. Autoradiographs are shown after polypeptide separation by SDS-PAGE. Arrowheads,
glutaraldehyde-induced cross-linked polypeptides. The Mr of protein standards is indicated (×10³). (D) L-Sox5-Sox6 bound the 2HMG probe as
dimers. Extracts of 10T1/2 fibroblasts transfected with empty (-), L-Sox5 or Sox6 expression plasmids were preincubated with products of in vitro
transcription/translation obtained with plasmids encoding no protein (-) or one of the truncated L-Sox5 proteins (indicated on top of the lanes).
EMSA was performed with the 2HMG probe. Arrows, complexes of L-Sox5 and Sox6 with DNA. Arrowhead, complex of 151/679 with DNA. Star,
heterocomplexes of 151/679, and L-Sox5 or Sox6 with DNA. (E) Model. Two molecules of L-Sox5/Sox6 (ovals) form a highly stable complex with
DNA upon binding to two adjacent recognition sites on DNA (thick curved line). The two protein molecules dimerize through their coiled-coil
domains and may induce a strong bend of DNA at the sites of binding.



Similarly, a p309-(2x231) construct, which harbored
two copies of a 231 bp Col2a1 chondrocyte-specific
enhancer (Lefebvre et al., 1996), including the 48 bp
element, was activated at a higher level by the three Sox
factors together than by each Sox protein individually
(Figure 6D). Deletion of a 10 bp sequence in the 231 bp
enhancer, which corresponded to the 3' end of the 48 bp
element, abolished enhancer activity in chondrocytes
(Lefebvre et al., 1997) and also transactivation by the
three Sox factors (Figure 6D). In contrast to L-Sox5/Sox6,
Sox4 and the short form of Sox5 did not activate the
p309-(2x231) construct, nor did they cooperate with
SOX9 to generate a high-level transactivation of the
construct in fibroblasts (Figure 6E).

Together these results indicated that L-Sox5 and Sox6
significantly enhanced transactivation of Col2a1 constructs
by SOX9. L-Sox5 and Sox6 appeared to function similarly,
whereas the short form of Sox5 was inactive.






Fig. 6. Cooperative activation of chondrocyte-specific Col2a1 promoter-enhancer constructs by L-Sox5, Sox6 and SOX9. (A) Schematic of Col2a1
constructs. The βgeo reporter gene was driven by a 309 bp (p309; -309/+308) Col2a1 promoter. Col2a1 enhancer segments were four tandem
copies of the 48 bp intron-1 element (4x48), two tandem copies of a 231 bp element (2x231) or two tandem copies of a 231 bp element in which
10 bp (triangles) were deleted (2x221). The deletion corresponded to the 3' end of the 48 bp element. Enhancers were cloned downstream of the
Col2a1 exon 1 in an intron delimited by a splice donor (SD), the proximal 70 bp of the Col2a1 intron 1 and a splice acceptor (SA). The '-' and '+'
symbols indicate whether the constructs were inactive or active, respectively, in chondrocytes. (B) Activation of p309. 10T1/2 fibroblasts were
transfected with p309 and 900 ng of expression plasmids. These included empty vector (-) and, as indicated, 300 ng of L-Sox5 (L5), Sox6 (S6) and
SOX9 (S9) plasmids. (C) Activation of p309-(4x48). Similar experiment as in (B) but with p309-(4x48) instead of p309. (D) Activation of p309-
(2x231) and p309-(2x221). 10T1/2 fibroblasts were transfected with p309-(2x231) or p309-(2x221), and either 600 ng of empty (-), L-Sox5, Sox6
or SOX9 plasmid, or 200 ng of each of the L-Sox5, Sox6 and SOX9 plasmids. (E) No activation of p309-(2x231) by Sox4 and Sox5. 10T1/2
fibroblasts were transfected with p309 or p309-(2x231), and either 600 ng of empty (-), Sox4 (S4), Sox5 (S5), L-Sox5, Sox6 or SOX9 plasmid, or
300 ng of SOX9 plasmid and 300 ng of empty or SOX plasmid.
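
The plasmid amounts in panels (B), (D) and (E) suggest that the total amount of expression plasmid was held constant within each experiment. A minimal sketch of that bookkeeping follows; it assumes, which the legend does not state explicitly, that empty vector makes up whatever remains after the Sox plasmids are added. The function name is hypothetical.

```python
# Sketch: amount of empty vector needed to keep total expression plasmid
# constant. ASSUMPTION: empty vector fills the remainder of the total.
def empty_vector_ng(total_ng, sox_plasmids_ng):
    """Return ng of empty vector for a given total and set of Sox plasmids.

    sox_plasmids_ng maps a plasmid label to the amount used, in ng.
    """
    used = sum(sox_plasmids_ng.values())
    if used > total_ng:
        raise ValueError(f"plasmids add up to {used} ng, above {total_ng} ng")
    return total_ng - used


if __name__ == "__main__":
    # Panel (B): 900 ng total, 300 ng per Sox plasmid.
    print(empty_vector_ng(900, {"L-Sox5": 300}))
    print(empty_vector_ng(900, {"L-Sox5": 300, "Sox6": 300, "SOX9": 300}))
    # Panel (D): 600 ng total, 200 ng of each of the three plasmids.
    print(empty_vector_ng(600, {"L-Sox5": 200, "Sox6": 200, "SOX9": 200}))
    # Panel (E): 600 ng total, 300 ng of SOX9 with or without a second plasmid.
    print(empty_vector_ng(600, {"SOX9": 300}))
```

Keeping the total constant in this way means that differences in reporter activity are not simply due to different amounts of transfected DNA.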



Cooperative activation of chondrocyte-specific 
genes by L-Sox5, Sox6 and SOX9

The ability of L-Sox5/Sox6 and SOX9 to activate the
endogenous Col2a1 gene was tested in several cell types
by Northern blot analysis after transient transfection with
Sox expression plasmids. As described previously
(Lefebvre et al., 1997), 10T1/2 cells spontaneously
expressed Col2a1, but at a much lower level than did
differentiated chondrocytes (Figure 7A). Following trans-
fection of Sox plasmids, Sox RNAs and proteins accumu-
lated in large amounts in the cells within 24 h, but rapidly
disappeared during the next 48 h (Figure 7A). A significant
increase in the Col2a1 mRNA level was observed 24 h
after transfection of either SOX9 or L-Sox5/Sox6 plasmids
(Figure 7A and B). Depending on the experiments, SOX9
was either as potent as, or more potent than, L-Sox5/
Sox6. The increase in Col2a1 RNA was transient and no
longer detectable after 48 h (Figure 7A), and was therefore
concomitant with high levels of Sox proteins. The same
results were observed whether L-Sox5 and Sox6 were
tested alone or together, indicating that they had redundant
activities (data not shown). No significant increase in
Col2a1 expression was observed when the cells were
transfected with an expression plasmid for Oct-1, a POU-
domain transcription factor capable of binding to the
48 bp Col2a1 enhancer (data not shown). This result
demonstrated the specificity of Sox protein activity.



Interestingly, coexpression of SOX9 and L-Sox5/Sox6
resulted in a much higher activation of Col2a1 expression
than when each Sox protein was expressed individually.
Similar results were obtained with MC615 cells (Figure
7B), which were tested after repeated passages in culture,
when expression of chondrocyte markers was severely
reduced or lost. Their low level of Col2a1 expression was
slightly stimulated upon expression of SOX9 alone or
L-Sox5/Sox6, but it was strongly stimulated by
coexpression of the three Sox proteins. When the cell culture
medium was supplemented with bone morphogenetic
protein 2 (BMP-2), a cytokine known to promote
chondrogenesis (Reddi, 1998), the Col2a1 RNA level also increased.
In this condition also, overexpression of the three Sox
factors highly stimulated Col2a1 expression. When cells
that did not spontaneously express Col2a1 were tested,
such as skin primary fibroblasts from newborn mice,
BALB/3T3 fibroblasts and COS cells, no induction of
Col2a1 expression was detected following transfection of
Sox protein expression plasmids (data not shown).
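
One way to make "much higher than when each Sox protein was expressed individually" concrete is to compare the combined fold activation with what simple additivity would predict. The sketch below is only an illustration: the function names and the fold values are hypothetical and are not taken from Figure 7, and the additive model is an assumption rather than the analysis used in this study.

```python
def additive_expectation(individual_folds):
    """Fold activation expected if each factor's increment over baseline simply added up."""
    return 1.0 + sum(fold - 1.0 for fold in individual_folds)


def cooperativity_index(combined_fold, individual_folds):
    """Ratio of the observed combined activation to the additive expectation.

    A value well above 1 is consistent with cooperative (more than additive) activation.
    """
    expected = additive_expectation(individual_folds)
    if expected <= 0:
        raise ValueError("additive expectation must be positive")
    return combined_fold / expected


# Hypothetical fold increases over the control culture (illustrative only, not measured data).
sox9_alone = 3.0
lsox5_sox6_alone = 2.0
all_three = 12.0

expected = additive_expectation([sox9_alone, lsox5_sox6_alone])
index = cooperativity_index(all_three, [sox9_alone, lsox5_sox6_alone])
print(f"additive expectation: {expected:.1f}-fold")
print(f"cooperativity index: {index:.1f}")
```

With real densitometry values, an index close to 1 would indicate additive effects, whereas the strong stimulation described above would correspond to an index clearly greater than 1.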

Activation by L-Sox5/Sox6 and SOX9 of other genes
expressed in chondrocytes along with Col2a1 was also
examined (Figure 7B). The aggrecan gene, which encodes
the protein core of a large aggregating proteoglycan found
in abundance in cartilage, was weakly expressed by
dedifferentiated MC615 cells, but overexpression of the
three Sox proteins led to a clear increase in its RNA level.






Col2a1 activation by L-Sox5, Sox6 and Sox9 



[Figure 7 image: panels A and B. Labels recoverable from the original: L-Sox5, Sox6, SOX9; 24, 48, 72 h; BMP-2; EMSA; Sox9/SOX9; osteopontin; MGP.]

Fig. 7. Cooperative activation of chondrocyte-specific genes by L-Sox5, Sox6 and SOX9. (A) Transient transfection of 10T1/2 cells. L-Sox5, Sox6
and/or SOX9 expression plasmids were transfected as indicated. Cells were harvested 24, 48 or 72 h after the start of transfection. Total cell extracts
were used in EMSA with the 2HMG probe. Single-tailed arrow, L-Sox5-Sox6-DNA complexes; double-tailed arrow, fast-migrating SOX9-DNA
complexes. The weak intensity of SOX9-DNA complexes compared with L-Sox5-Sox6-DNA complexes reflects a less-efficient binding of SOX9 to
the 2HMG probe. Total RNA was hybridized in Northern blots with probes for 18S rRNA (reference for RNA loading), Sox9 and Col2a1.
Differentiated MC615 cells at an early passage are shown in the left lane as a reference for chondrocytic cells. Endogenous Sox9 RNA forms a band
at the top of the panel; SOX9 RNAs expressed from the transfected plasmid run as an intense band migrating faster. The more intense signal seen at
the level of endogenous Sox9 RNA in cells transfected with SOX9 may be due either to increased expression of Sox9 or to larger transcripts from
the SOX9 plasmid. The exposure of the autoradiograph of the Col2a1 blot was eight times as short (3 h) for MC615 cells as for 10T1/2 cells. The
intensity of hybridization signal for Col2a1 RNA in the different culture conditions is plotted as fold increase over the signal given by the control
culture 24 h after transfection. (B) Transfection of MC615 cells. Cells that had essentially lost their chondrocytic phenotype by repeated passaging
were transfected with Sox expression plasmids, as indicated. BMP-2 was added at the start of transfection. Cells were harvested 24 h later and RNA
analyzed in Northern blot with various probes, as indicated. RNA from RCS cells was used as a reference for differentiated chondrocytes. The
intensity of the Col2a1 hybridization signal in the different conditions is plotted as fold increase over the signal given by the control culture in the
absence of BMP-2. Hybridization signals obtained with an 18S rRNA probe are shown as a reference for RNA loading. The exposure of the
autoradiograph of the aggrecan blot was five times as short for RCS cells (3 h) as for 10T1/2 cells.
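
The "fold increase over the signal given by the control culture" plotted in this figure can be sketched as a simple ratio. The example below is a minimal illustration under stated assumptions: dividing each Col2a1 signal by its 18S rRNA signal is an assumed way of using the loading reference (the legend only says the 18S signal is shown as a reference), and all names and numbers are hypothetical rather than data from this study.

```python
def fold_increase(signal, loading, control_signal, control_loading):
    """Loading-corrected hybridization signal expressed relative to the control culture."""
    if loading <= 0 or control_loading <= 0:
        raise ValueError("loading signals must be positive")
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    if signal < 0:
        raise ValueError("signal cannot be negative")
    return (signal / loading) / (control_signal / control_loading)


# Hypothetical densitometry readings in arbitrary units: (Col2a1 signal, 18S rRNA signal).
lanes = {
    "control": (100.0, 50.0),
    "SOX9": (300.0, 50.0),
    "L-Sox5 + Sox6": (200.0, 40.0),
    "L-Sox5 + Sox6 + SOX9": (1500.0, 50.0),
}

control_signal, control_loading = lanes["control"]
for name, (signal, loading) in lanes.items():
    fold = fold_increase(signal, loading, control_signal, control_loading)
    print(f"{name}: {fold:.1f}-fold")
```

Because blots were exposed for different times in some lanes, any real comparison would also need signals from equal exposures or a correction for exposure time, which this sketch does not attempt.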



RNA for Col9a2, which codes for the α2 chain of type
IX collagen, was undetectable in MC615 cells and not
induced by any of the three Sox protein expression
plasmids. RNAs for aggrecan and Col9a2 were not
detectable in 10T1/2 cells, even after transfection of the
three Sox protein expression plasmids (data not shown).

RNA levels of genes that are expressed in chondrocytes
but not in parallel with Col2a1 were examined in order
to test the specificity of gene transactivation by the
three Sox proteins. Col10a1, a characteristic marker of
hypertrophic chondrocytes, was expressed at low levels
in MC615-dedifferentiated cells, but its expression was
not affected by transfection of Sox protein expression
plasmids. The genes for matrix Gla protein (MGP) and
osteopontin, two extracellular matrix proteins produced
by chondrocytes and also some other cell types, were
expressed in MC615 cells, but transfection of the three
Sox factors did not significantly affect their RNA levels
(Figure 7B).



In conclusion, L-Sox5/Sox6 and SOX9 were found to
cooperatively stimulate expression of Col2a1 and also
aggrecan, two major markers of chondrocytes. These data
strongly suggest that the three Sox proteins together control
important aspects of the chondrocyte phenotype.

Discussion 

We have shown here that a new form of Sox5 (L-Sox5)
and Sox6, both of which are dimeric Sox proteins that
preferentially bind to adjacent HMG sites rather than to
isolated sites, are coexpressed with Sox9 during
chondrogenesis, efficiently bind to several HMG-like sites in
the 48 bp chondrocyte-specific Col2a1 enhancer, and
cooperate with SOX9 in activating the chondrocyte marker
gene Col2a1.

L-Sox5 differs from Sox5 by an additional 287 amino-acid
sequence at the N-terminus. L-Sox5 and Sox6 show
a striking degree of identity in the HMG DNA-binding






V.Lefebvre, P.Li and B.de Crombrugghe



domain located in the C-terminal part and in a coiled-coil
domain located in the N-terminal part. This coiled-coil
domain has been shown previously to mediate
homodimerization of Sox6 (Takamatsu et al., 1995), and we
have shown that L-Sox5 also dimerizes through this
domain, either with itself or with Sox6. Sox12, Sox13
and Sox23 (Komatsu et al., 1996; Kido et al., 1998;
Yamashita et al., 1998) also feature a similar coiled-coil
domain and their HMG domain is much more similar to
those of L-Sox5 and Sox6 than to those of Sry, Sox9 and
other Sox family members. Based on these similarities,
the five Sox proteins are classified as Sox subgroup D
(Wright et al., 1993; Yamashita et al., 1998). We have
presented strong evidence that dimerization of L-Sox5/
Sox6 highly stabilizes binding to DNA at adjacent
recognition sites. Indeed, L-Sox5 and Sox6 bound to an element
harboring two HMG sites much more efficiently than to
an element harboring a single site. Two molecules of Sox
polypeptides were binding to one molecule of 2HMG
probe, and both the HMG and coiled-coil domains of the
two polypeptide molecules were required for efficient
DNA binding. SOX9, which does not homodimerize, did
not bind to the 2HMG probe more efficiently than to the
1HMG probe. The complementary roles of the coiled-coil
and HMG domains of L-Sox5/Sox6 in DNA binding
probably account for the high degree of conservation of
both domains in D-Sox family members. It also implies
that target genes for these Sox proteins must harbor pairs
of HMG binding sites with a configuration compatible
with binding of D-Sox protein dimers.

Transcripts for L-Sox5 and Sox6 were expressed along
with Sox9 in all prechondrogenic areas and cartilages
during mouse embryonic development. Expression of all
three Sox genes was inhibited when chondrocytes became
hypertrophic in growth-plate cartilages; in these cells
expression of the chondrocyte marker gene Col2a1 is also
downregulated. Sox9, Sox5 (transcript for L-Sox5) and
Sox6 were also expressed in some non-cartilaginous sites,
such as notochord, otic vesicles and some areas of the
brain, but they were not coexpressed where Col2a1 was
not expressed. In cell cultures, the three Sox genes were
coexpressed only in primary chondrocytes and
chondrocytic cell lines, both of which highly expressed Col2a1,
and a sharp decrease in the three Sox RNA levels
accompanied loss of Col2a1 expression during chondrocyte
dedifferentiation. We have also shown that the three Sox
proteins were present in chondrocytes.

Our data indicate that when SOX9 and L-Sox5/Sox6
were transfected individually in 10T1/2 and MC615 cells,
they only produced a modest increase in expression of the
endogenous Col2a1 gene. However, upon cotransfection,
cooperativity occurred among the three Sox proteins,
leading to expression levels of Col2a1 that were an
order of magnitude or more higher than in control cells.
Expression of aggrecan was also significantly increased
in MC615 cells transfected with the three Sox protein
expression plasmids. The two classes of Sox proteins,
L-Sox5/Sox6 and Sox9, are therefore able to cooperate
functionally with each other in the activation of both
Col2a1 and aggrecan. These results, together with the
strong correlation that exists between expression of
Col2a1, aggrecan (Glumoff et al., 1994) and the three
Sox genes during chondrogenesis in mouse embryos,
support the notion that the three Sox proteins may also
play a role in the activation of Col2a1, aggrecan and
possibly other chondrocyte marker genes in vivo.

A 48 bp element in the first intron of Col2a1 was
previously shown to be sufficient to direct expression of
a reporter gene in cartilage of transgenic mice and to
contain sites essential for the activity of longer Col2a1
enhancer segments in chondrocytes (Lefebvre et al., 1996;
Bell et al., 1997; Zhou et al., 1998). This element is
therefore likely to be involved in Col2a1 expression in
chondrocytes. Sox9, L-Sox5 and Sox6 bound to the 48 bp
enhancer at four HMG-like sites, three of which were
demonstrated to be necessary for enhancer activity in
chondrocytes (Lefebvre et al., 1996; Bell et al., 1997;
Zhou et al., 1998), whereas the role of the fourth has not
been tested. On their own, L-Sox5/Sox6 were weak
activators of Col2a1 enhancer constructs. However, they
efficiently cooperated with SOX9 in the activation of the
enhancer, leading to a level of activation that was
several-fold higher than when each Sox protein was transfected
individually. The promoter-enhancer configuration in
these constructs was similar to that of the endogenous
Col2a1 gene, in which the chondrocyte-specific enhancer
is located in the first intron downstream of the transcription
start site. L-Sox5/Sox6 were able to contact all four
HMG-like recognition sites of the 48 bp Col2a1 enhancer. They
probably bound as dimers, as their complexes with the
enhancer and with the 2HMG probe migrated at the same
level. Even though the binding sites for L-Sox5/Sox6 in
the 48 bp enhancer are different from the preferred binding
sites of the short Sox5 (AACAAT and AACAAAG; Denny
et al., 1992), our experiments indicate that L-Sox5/Sox6,
as well as SOX9, efficiently bound to the enhancer. It is
possible that the proximity of several HMG-like sites in
the enhancer was favorable to cooperativity between the
two types of Sox proteins in achieving transcriptional
activation.

Together the three Sox factors enhanced Col2a1 and
aggrecan expression in cells that expressed low levels of
these genes, but they did not induce Col2a1 expression
in cells in which the gene was silent, even though they
activated Col2a1 constructs at high levels in transient
transfection of these same cells. It is possible that
additional factors or coactivators may be needed to open the
chromatin of chondrocyte-specific genes that were silent
in the cells that were tested or that these genes were
inactivated by other epigenetic mechanisms. Bell et al.
(1997) reported that Sox9 was capable of activating
Col2a1 in some ectopic sites of transgenic mouse embryos.
It is possible that the Col2a1 gene is not repressed in
mouse embryos as it might be in tissue culture cells or
that the ectopic sites in which Sox9 was able to activate
Col2a1 contained factors that allowed Sox9 to activate
Col2a1.

Denny et al. (1992) reported that Sox5 was unable to
activate transcription from a minimal promoter linked to
multimerized Sox5 binding sites. In our experiments,
L-Sox5/Sox6 were able to activate Col2a1 constructs
whereas Sox5 was not. The more efficient binding of
L-Sox5/Sox6 to DNA, compared with that of Sox5,
is sufficient to explain why L-Sox5/Sox6 transactivated
whereas Sox5 did not. But it is also possible that the
N-terminus of L-Sox5/Sox6 harbors one or more domains
involved in transactivation. Takamatsu et al. (1995)
reported that full-length Sox6 was unable to transactivate
a reporter construct containing four copies of an
HMG-binding site. Transactivation and DNA binding occurred
upon deletion of the leucine zipper of the protein. In our
experiments, L-Sox5/Sox6 bound to the Col2a1 enhancer
and transactivated Col2a1 constructs as full-length
proteins. Differences between the two studies may be due to
differences in DNA targets. It may be that the sequence
of the Col2a1 enhancer, which is a potential target of
L-Sox5/Sox6, the distance between HMG-like sites and
possibly the presence of binding sites for additional
activators are essential for L-Sox5/Sox6 binding to DNA
and transactivation function.

HMG-domain proteins have been shown to participate
as architectural factors in transcriptional activation of
several genes (Grosschedl et al., 1994; Werner and Burley,
1997). By their ability to bend DNA, a property also
demonstrated for Sox5 and SOX9 (Connor et al., 1994;
Lefebvre et al., 1997), they facilitate interactions between
proteins bound at non-adjacent DNA sites and thereby
promote the assembly of multiprotein-enhancer complexes
(Giese et al., 1995; Pevny and Lovell-Badge, 1997). It is
possible that L-Sox5/Sox6 have such a role, possibly
together with Sox9, in organizing an enhancer-protein complex
and also in bringing this complex close to the basal
transcriptional machinery, which in Col2a1 is 2.2 kb
upstream of the enhancer. But the function of Sox9 is
probably not limited to an architectural role, as it was
shown to have a potent transactivation domain (Sudbeck
et al., 1996; Lefebvre et al., 1997; Ng et al., 1997).

The function of L-Sox5/Sox6 in vivo is not known. The
ability of L-Sox5/Sox6 to cooperate with SOX9 in Col2a1
activation and the strong correlation between expression
of L-Sox5/Sox6 and chondrogenesis suggest that mutations
in the SOX5 and SOX6 genes might result in cartilage
malformation diseases of still unknown causes. However,
because of a possible redundancy of these two highly
similar factors, a mutation in only one of their genes might
cause only mild or no skeletal abnormalities. Although
L-Sox5 and Sox6 belong to the same family of DNA-binding
proteins as Sox9, and present the same expression pattern
as Sox9 in chondrogenesis, it is unlikely that they play
the same role as Sox9 because they differ substantially
from Sox9 in DNA-binding and transactivation properties.
The severe phenotype of patients with campomelic
dysplasia, in which mutations in SOX9 are heterozygous, also
strongly suggests that no other protein with a function
that overlaps that of SOX9 exists in chondrocytes. Our
data strongly suggest that Sox9 and L-Sox5/Sox6 represent
two different subclasses of Sox proteins that are
coexpressed during chondrogenesis, where they have distinct,
complementary roles in the activation of important
chondrocyte phenotype markers such as Col2a1.

Materials and methods 

cDNA cloning 

cDNA libraries were made from primary chondrocytes of ribs of newborn
mice. Total RNA was isolated from cells cultured for 2-3 days (Lefebvre
et al., 1994). Poly(A)+ RNA was purified using the Poly(A) Quick
mRNA isolation kit from Stratagene (La Jolla, CA). Synthesis of
double-stranded cDNA and ligation of adaptors were performed using the
Superscript Choice system from Gibco-BRL (Gaithersburg, MD). Priming
was performed with a mixture of oligo-dT and random hexamers.
One expression library was made in the λgt11 phage vector and another
in the λTriplEx phagemid vector from Clontech (Palo Alto, CA). λ DNA
was packaged with Gigapack III Gold Packaging extract (Stratagene).
Southwestern screening was performed according to the method of
Vinson et al. (1988) and Singh et al. (1989), and Clontech's instructions,
after denaturation of filters with guanidine hydrochloride. Filters were
preincubated for 30 min in incubation buffer (20 mM HEPES pH 7.9,
10% glycerol, 0.1% Nonidet P-40, 0.5 mM EDTA, 1 mM PMSF,
2 µg/ml leupeptin, 2 µg/ml pepstatin and 0.5 mM DTT) and further
incubated for 2 h in new buffer, supplemented with 50 fmol/ml of
32P-labeled 2HMG probe and 1 µg/ml poly(dG-dC) or poly(dI-dC). Filters
were washed four times for 1 min in incubation buffer and
autoradiographed. New segments of L-Sox5 and Sox6 cDNAs were sequenced
on both strands. A fragment of Sox6 cDNA was amplified by PCR using
primers encompassing two HincII restriction sites in the coding sequence
and first-strand cDNA from mouse chondrocytes or adult testis. PCR
products were electrophoresed in agarose gel, eluted and sequenced.

Electrophoretic mobility shift assay (EMSA) 

EMSAs were performed as described previously (Lefebvre et al., 1997)
using a 1HMG or Col2a1 48 bp probe (Lefebvre et al., 1997), or a
2HMG probe. The latter probe consisted of two tandem repeats of the
1HMG probe with BamHI- and BglII-cleaved sites at the 5' and 3' ends,
respectively, and with the two repeats linked by the sequence GGATCT.
Extracts from transfected fibroblasts were made in 100 mM potassium
phosphate buffer, pH 7.2, containing 0.2% Triton X-100. Proteins were
synthesized in vitro using the TNT T7 Quick Coupled Transcription/
translation system (Promega, Madison, WI) and the expression plasmids
described below.

Antibody preparation and Western blotting 

Antisera were raised in rabbits (Genosys Biotechnologies, The
Woodlands, TX) using keyhole limpet hemocyanin conjugated to a
peptide homologous to the C-terminus of Sox9 (Bridgewater et al.,
1998) or corresponding to a sequence in the C-terminus of either Sox5
(PDVDYGSDSENHIAG) or Sox6 (PKSDYSSENEAPEPV). Specific
antibodies were purified as described previously (Bridgewater et al.,
1998). Purified antibodies (~0.1 mg/ml) were dialyzed against
phosphate-buffered saline. Western blots were prepared as described previously
(Lefebvre et al., 1997) with Sox protein antibodies diluted 1:1000.

Probes for Northern blotting and in situ hybridization 

An NcoI-NruI 552 bp fragment of mouse L-Sox5 cDNA, located in the
5' end of the coding sequence, specifically hybridized with the long
transcript of Sox5. An NcoI 448 bp fragment of Sox5 cDNA, which
included the HMG box, specifically hybridized with the short and long
transcripts of Sox5. A BglII-HindIII 454 bp fragment of Sox6 cDNA,
located 1.3 kb downstream of the stop codon, specifically hybridized
with the long transcript of Sox6. An AccI-PstI 478 bp fragment of Sox6
cDNA, located in the 5' end of the coding sequence, specifically
hybridized with the short and long transcripts of Sox6. The aggrecan
probe was a 650 bp fragment of the mouse cDNA, encoding the
C-terminal half of the G2 domain and most of the KS adjacent domain
(Walcz et al., 1994). Other probes were as described previously (Lefebvre
et al., 1995, 1997).

Northern blotting 

Total RNA was isolated and analyzed in Northern blots as described
previously (Lefebvre et al., 1997). To facilitate transfer of large RNA
species to nylon membrane, agarose gels were treated with 50 mM
NaOH for 15 min before blotting. RNA standards were from Gibco-BRL.
Hybridization signals were quantified on autoradiograms using the
Intelligent Quantifier software program from Bio Image (Ann Arbor, MI).

In situ hybridization 

Preparation of mouse embryo sections and hybridization with sense and
antisense RNA probes labeled with [α-35S]UTP were performed as
described previously (Zhao et al., 1997). Autoradiograms with collagen
and Sox RNA probes were developed after 1 and 6-7 days, respectively.
Sense probes showed no detectable signal over background.

Expression plasmids 

Full-length mouse L-Sox5 and Sox6 coding sequences and deletions
151/679 and 213/679 were amplified by PCR and cloned into the
pcDNA-5'UT and pcDNA-5'UT-FLAG mammalian expression plasmids, as
described previously for human SOX9 and mouse Sox5 and Sox4
(Lefebvre et al., 1997). The FLAG epitope did not affect DNA binding
and transactivation properties of any Sox protein (data not shown). PCR
products were verified by DNA sequencing. To obtain deletion 32/437,
an NcoI fragment of L-Sox5 cDNA was blunted and cloned into
blunt-ended BamHI and EcoRI sites of pcDNA-5'UT-FLAG; a translation
termination codon was located in the XbaI site located downstream of
the EcoRI site. Deletion 32/221-304/437 in L5 was obtained by cutting
off a PstI fragment in deletion 32/437. Sox6B (previously called SoxLZ)
cDNA was a gift from Tadayoshi Shiba and Shinya Yamashita (Takamatsu
et al., 1995). Sox6A and Sox6C coding sequences were reconstituted
by replacing the HincII fragment of Sox6B cDNA with Sox6A- and
Sox6C-specific fragments, which were obtained by PCR.

Synthesis of protein in vitro and cross-linking

Protein was synthesized in vitro with [35S]methionine. For cross-linking,
2 µl of protein sample was preincubated for 30 min in 7 µl of
DNA-binding buffer, with or without 5 fmol of 2HMG oligonucleotide, and
incubated for 10 min with 1 µl of 0.1% glutaraldehyde. Polypeptide
species were separated by SDS-PAGE in a polyacrylamide gradient
(5-12%) gel and revealed by autoradiography.

Cell cultures

All cell types were cultured as described previously (Lefebvre et al.,
1997). Primary chondrocytes from ribs of newborn mice were used,
unless otherwise indicated, after 2-3 days in culture, when they were
essentially fully differentiated (Lefebvre et al., 1994). RCS cells were
previously shown to display a highly stable phenotype of early-stage
chondrocytes (Mukhopadhyay et al., 1995). MC615 cells were used after
early passage when they exhibited a fairly differentiated chondrocytic
phenotype (Mallein-Gerin et al., 1993) or after repeated passages
when they had essentially lost their chondrocytic phenotype. Human
recombinant bone morphogenetic protein-2 (Genetics Institute,
Cambridge, MA) was added to the culture medium at a concentration
of 150 ng/ml.

Transient transfection 

The p309 Col2a1-βgeo reporter construct has been described previously
(Zhou et al., 1995). The βgeo gene encodes a fusion protein with E.coli
β-galactosidase and neomycin-resistance activities. Constructs
p309-(4x48), p309-(2x231) and p309-(2x221) were obtained by cloning
wild-type and mutant enhancer fragments (Lefebvre et al., 1996) into p309
as described previously for other constructs (Zhou et al., 1995). Reporter
constructs were cotransfected with Sox protein expression plasmids
using lipofectamine (Gibco-BRL) (Lefebvre et al., 1997). A plasmid,
pSV2βgal or pGL2 (Promega), was included in all transfections as an
internal control for transfection efficiency. Reporter activities were
determined after normalization for transfection efficiency. Reporter and
control plasmids were transfected in a 3:1 ratio, and expression plasmids
were included in various amounts, as indicated. Empty expression
plasmid was added, whenever necessary, to transfect the same total
amount of DNA in all samples. To assess expression of endogenous
genes, cells were transfected using either lipofectamine or FuGENE 6
(Boehringer Mannheim, Indianapolis, IN) according to the manufacturer's
instructions.
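
The two bookkeeping steps described above, topping up with empty expression plasmid so that every sample receives the same total DNA, and normalizing reporter activity to the internal control, can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the 600 ng total is taken from the plasmid amounts quoted in the figure legends, and the activity readings are made-up numbers, not results from this study.

```python
def balance_with_empty(expression_ng, total_ng):
    """Add empty expression plasmid so the expression-plasmid total equals total_ng."""
    used = sum(expression_ng.values())
    if used > total_ng:
        raise ValueError("expression plasmids exceed the target total")
    mix = dict(expression_ng)
    mix["empty"] = total_ng - used
    return mix


def normalized_activity(reporter_activity, control_activity):
    """Reporter activity corrected for transfection efficiency via the internal control."""
    if control_activity <= 0:
        raise ValueError("internal control activity must be positive")
    return reporter_activity / control_activity


# Plasmid amounts in ng, following the combinations quoted in the figure legends.
samples = {
    "empty only": balance_with_empty({}, 600),
    "SOX9 + empty": balance_with_empty({"SOX9": 300}, 600),
    "all three": balance_with_empty({"L-Sox5": 200, "Sox6": 200, "SOX9": 200}, 600),
}
for name, mix in samples.items():
    print(name, mix)

# Hypothetical raw readings (reporter, internal control) in arbitrary units.
print(normalized_activity(50.0, 10.0))
```

Dividing by the internal control is one common way to normalize; the text does not specify the exact calculation, so the ratio used here is an assumption.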

DDBJ/EMBL/GenBank accession numbers 

The accession Nos are: L-Sox5, AJ010604; Sox6, AJ010605.

Nomenclature 

SOX refers to human proteins; Sox refers to mouse proteins. Genes 
are italicized. 

Acknowledgements 

This work was funded by NIH grants R01 AR42909 and P01 AR42919-02
to B.d.C. V.L. is the recipient of an Arthritis Investigator Award from
the Arthritis Foundation. We thank James H.Kimura for RCS cells,
Bjorn Olsen for MC615 cells, the Genetics Institute for BMP2, Tadayoshi
Shiba and Shinya Yamashita for Sox-LZ cDNA, Kurt J.Doege for the
mouse aggrecan probe, Gerard Karsenty for the MGP probe and William
T.Butler for the osteopontin probe and ROS 17/2.8 cells. We are grateful
to Sankar N.Maity for precious advice, Heidi Eberspaecher, Xin Zhou,
Ni Lu and Laura C.Bridgewater for help in experimental work, and
Shane Zhao for help in analysis of cDNA and protein sequences. DNA
sequencing was performed by The University of Texas M.D. Anderson
Cancer Center core sequencing facility, which is supported by NCI grant
CA16672. We thank William H.Klein, Richard R.Behringer, Randy
L.Johnson and Sankar N.Maity for critical reviewing of the manuscript.



References 

Bell,D.M. et al. (1997) SOX9 directly regulates the type-II collagen
gene. Nature Genet., 16, 174-178.

Bridgewater,L.C., Lefebvre,V. and de Crombrugghe,B. (1998)
Chondrocyte-specific enhancer elements in the Col11a2 gene resemble
the Col2a1 tissue-specific enhancer. J. Biol. Chem., 273, 14998-15006.

Cancedda,R., Descalzi Cancedda,F. and Castagnola,P. (1995)
Chondrocyte differentiation. Int. Rev. Cytol., 159, 265-358.

Cheah,K.S.E., Lau,E.T., Au,P.K.C. and Tam,P.P.L. (1991) Expression of
the mouse α1(II) collagen gene is not restricted to cartilage during
development. Development, 111, 945-953.

Connor,F., Cary,P.D., Read,C.M., Preston,N.S., Driscoll,P.C., Denny,P.,
Crane-Robinson,C. and Ashworth,A. (1994) DNA binding and bending
properties of the post-meiotically expressed Sry-related protein Sox-5.
Nucleic Acids Res., 22, 3339-3346.

Connor,F., Wright,E., Denny,P., Koopman,P. and Ashworth,A. (1995)
The Sry-related HMG-box containing gene Sox6 is expressed in the
adult testis and developing nervous system of the mouse. Nucleic
Acids Res., 23, 3365-3372.

Denny,P., Swift,S., Connor,F. and Ashworth,A. (1992) An SRY-related
gene expressed during spermatogenesis in the mouse encodes a
sequence-specific DNA-binding protein. EMBO J., 11, 3705-3712.

Erlebacher,A., Filvaroff,E.H., Gitelman,S.E. and Derynck,R. (1995)
Toward a molecular understanding of skeletal development. Cell, 80,
271-278.

Foster,J.W. et al. (1994) Campomelic dysplasia and autosomal sex
reversal caused by mutations in an SRY-related gene. Nature, 372,
525-530.

Giese,K., Kingsley,C., Kirshner,J.R. and Grosschedl,R. (1995) Assembly
and function of a TCRα enhancer complex is dependent on
LEF-1-induced DNA bending and multiple protein-protein interactions.
Genes Dev., 9, 995-1008.

Glumoff,V., Savontaus,M., Vehanen,J. and Vuorio,E. (1994) Analysis of
aggrecan and tenascin gene expression in mouse skeletal tissues by
Northern and in situ hybridization using species specific cDNA probes.
Biochim. Biophys. Acta, 1219, 613-622.

Grosschedl,R., Giese,K. and Pagel,J. (1994) HMG domain proteins:
architectural elements in the assembly of nucleoprotein structures.
Trends Genet., 10, 94-100.

Hall,B.K. and Miyake,T. (1995) Divide, accumulate, differentiate: cell
condensation in skeletal development revisited. Int. J. Dev. Biol., 39,
881-893.

Houston,C.S., Opitz,J.M., Spranger,J.W., Macpherson,R.I., Reed,M.H.,
Gilbert,E.F., Herrman,J. and Schinzel,A. (1983) The campomelic
syndrome: review, report of 17 cases and follow-up on the currently
17-year old boy first reported by Maroteaux et al. in 1971. Am. J.
Med. Genet., 15, 2-28.

Kent,J., Wheatley,S.C., Andrews,J.E., Sinclair,A.H. and Koopman,P.
(1996) A male-specific role for SOX9 in vertebrate sex determination.
Development, 122, 2813-2822.

Kido,S., Hiraoka,Y., Ogawa,M., Sakai,Y., Yoshimura,Y. and Aiso,S.
(1998) Cloning and characterization of mouse mSox13 cDNA. Gene,
208, 201-206.

Komatsu,N., Hiraoka,Y., Shiozawa,M., Ogawa,M. and Aiso,S. (1996)
Cloning and expression of Xenopus laevis xSox12 cDNA. Biochim.
Biophys. Acta, 1305, 117-119.

Kwok,C. et al. (1995) Mutations in SOX9, the gene responsible for
campomelic dysplasia and autosomal sex reversal. Am. J. Hum. Genet.,
57, 1028-1036.

Laudet,V., Stehelin,D. and Clevers,H. (1993) Ancestry and diversity of
the HMG box superfamily. Nucleic Acids Res., 21, 2493-2501.

Lefebvre,V. and de Crombrugghe,B. (1998) Toward understanding SOX9
function in chondrocyte differentiation. Matrix Biol., 16, 529-540.

Lefebvre,V., Garofalo,S., Zhou,G., Metsaranta,M., Vuorio,E. and de
Crombrugghe,B. (1994) Characterization of primary cultures of
chondrocytes from type II collagen/β-galactosidase transgenic mice.
Matrix Biol., 14, 329-335.

Lefebvre,V., Garofalo,S. and de Crombrugghe,B. (1995) Type X collagen
gene expression in mouse chondrocytes immortalized by a
temperature-sensitive simian virus 40 large tumor antigen. J. Cell Biol., 128,
239-245.






Lefebvre,V. et al. (1996) An 18-base-pair sequence in the mouse proal 
(II) collagen gene is sufficient for cartilage expression and binds 
nuclear proteins that are selectively expressed in chondrocytes. Mol 
Cell. Biol, 16, 4512-4523. 

Lefebvre,V., Huang, W., Harley,V.R., Goodfellow,P.N. and de 
Crombrugghe,B. (1997) SOX9 is a potent activator of the chondrocyte- 
specific enhancer of the proal (II) collagen gene. Mol Cell Biol, 
17, 2336-2346. 

Lupas,A, (1996) Coiled coils: new structures and new functions. Trends 

Biochem. Sci., 21, 375-382. 
MalIein-Gerin,F. and 01sen,B.R. (1993) Expression of simian virus 

40 large T (tumor) oncogene in mouse chondrocytes induces cell 

proliferation without loss of the differentiated phenotype. Proc. Natl 

Acad. Sci. USA, 90, 3289-3293. 
Mansour,S., Hall,C.M, Pembrey,M.E. and Young,I.D. (1995) A clinical 

and genetic study of campomelic dysplasia. J. Med. Genet, 32, 

415-^20. 

Meyer,J. et al. (1997) Mutational analysis of the SOX9 gene in 

campomelic dysplasia and autosomal sex reversal: lack of genotype/ 

phenotype correlations. Hum. Mol. Genet, 6, 91-98. 
Morais da Si1va,S., Hacker,A., Harley,V., Goodfel!ow,P., Swain,A. and 

Lovell-Badge,R. (1996) Sox9 expression during gonadal development 

implies a conserved role for the gene in testis differentiation in 

mammals and birds. Nature Genet, 13, 62-68. 
Mukhopadhyay,K., Lefebvre,V., Zhou,G., Garofalo,S., Kimura,J.H. and 

de Crombrugghe,B. (1995) Use of a new rat chondrosarcoma cell line 

to delineate a 119-base pair chondrocyte-specific enhancer element 

and to define active promoter segments in the mouse pro-al (II) 
. collagen gene. J. Biol Chem., 270, 27711-27719. 
NgJL.-J. et al (1997) SOX9 binds DNA, activates transcription and 

coexpressed with type II collagen during chondrogenesis in the mouse. 

Dev. Biol, 183, 108-121. 
Pevny,L.H. and Lovell-Badge,R. (1997) Sox genes find their feet Curr. 

Opin. Gen. Dev., 7, 338-344. 
Reddi,A.H. (1998) Role of morphogenetic proteins in skeletal tissue 

engineering and regeneration. Nature Biotech., 16, 247-252. 
Singh ,H., Clerc.R.G. and LeBowitz,J.H. (1989) Molecular cloning of 

sequence-specific DNA binding proteins using recognition site probes. 

BioTechniques, 7, 252—261. 
Southard-Smith,E.M., Kos,L. and Pavan,W.J. (1998) Sox10 mutation
disrupts neural crest development in Dom Hirschsprung mouse model.
Nature Genet., 18, 60-64.
Sudbeck,P., Schmitz,L., Baeuerle,P.A. and Scherer,G. (1996) Sex reversal
by loss of the C-terminal transactivation domain of human SOX9.
Nature Genet., 13, 230-232.
Takamatsu,N., Kanda,H., Tsuchiya,I., Yamada,S., Ito,M., Kabeno,S.,
Shiba,T. and Yamashita,S. (1995) A gene that is related to SRY and
is expressed in the testes encodes a leucine zipper-containing protein.
Mol. Cell. Biol., 15, 3759-3766.
Vinson,C.R., LaMarco,K.L., Johnson,P.F., Landschulz,W.H. and
McKnight,S.L. (1988) In situ detection of sequence-specific DNA
binding activity specified by a recombinant bacteriophage. Genes
Dev., 2, 801-806.

Wagner,T. et al. (1994) Autosomal sex reversal and campomelic dysplasia
are caused by mutations in and around the SRY-related gene SOX9.
Cell, 79, 1111-1120.
Walcz,E., Deak,F., Erhardt,P., Coulter,S.N., Fülöp,C., Horvath,P.,
Doege,K.J. and Glant,T.T. (1994) Complete coding sequence, deduced
primary structure, chromosomal localization and structural analysis of
murine aggrecan. Genomics, 22, 364-371.
Werner,M.H. and Burley,S.K. (1997) Architectural transcription factors:
proteins that remodel DNA. Cell, 88, 733-736.
Wong,S.S. (1991) Chemistry of Protein Conjugation and Cross-linking.
CRC Press, Boca Raton, FL, pp. 75-145.
Wright,E.M., Snopek,B. and Koopman,P. (1993) Seven new members
of the Sox gene family expressed during mouse development. Nucleic
Acids Res., 21, 744.
Wright,E., Hargrave,M.R., Christiansen,J., Cooper,L., Kun,J., Evans,T.,
Gangadharan,U., Greenfield,A. and Koopman,P. (1995) The Sry-related
gene Sox9 is expressed during chondrogenesis in mouse
embryos. Nature Genet., 9, 15-20.
Wunderle,V.M., Critcher,R., Ashworth,A. and Goodfellow,P.N. (1996)
Cloning and characterization of SOX5, a new member of the human
SOX gene family. Genomics, 36, 354-358.
Yamashita,A., Suzuki,S., Fujitani,K., Kojima,M., Kanda,H., Ito,M.,
Takamatsu,N., Yamashita,S. and Shiba,T. (1998) cDNA cloning of a
novel rainbow trout SRY-type HMG box protein, rt23 and its functional
analysis. Gene, 209, 193-200.
Zhao,Q., Eberspaecher,H., Lefebvre,V. and de Crombrugghe,B. (1997)
Parallel expression of Sox9 and Col2a1 in cells undergoing
chondrogenesis. Dev. Dynam., 209, 377-386.
Zhou,G., Garofalo,S., Mukhopadhyay,K., Lefebvre,V., Smith,C.N.,
Eberspaecher,H. and de Crombrugghe,B. (1995) A 182 bp fragment
of the mouse proα1(II) collagen gene is sufficient to direct chondrocyte
expression in transgenic mice. J. Cell Sci., 108, 3677-3684.
Zhou,G., Lefebvre,V., Zhang,Z., Eberspaecher,H. and de Crombrugghe,B.
(1998) Three HMG-like sequences within a 48-bp enhancer of the
Col2a1 gene are required for cartilage-specific expression in vivo.
J. Biol. Chem., 273, 14989-14997.

Received May 26, 1998; revised July 27, 1998; 
accepted August 3, 1998 









